CYTOSKELETON-ASSOCIATED PROTEINS

TECHNICAL FIELD
This invention relates to nucleic acid and amino acid sequences of cytoskeleton-associated
proteins and to the use of these sequences in the diagnosis, treatment, and prevention of cell
proliferative disorders, viral infections, and neurological disorders, and in the assessment of the effects
of exogenous compounds on the expression of nucleic acid and amino acid sequences of
cytoskeleton-associated proteins.

BACKGROUND OF THE INVENTION

The cytoskeleton is a cytoplasmic network of protein fibers that mediate cell shape, structure,
and movement. The cytoskeleton supports the cell membrane and forms tracks along which
organelles and other elements move in the cytosol. The cytoskeleton is a dynamic structure that
allows cells to adopt various shapes and to carry out directed movements. Major cytoskeletal fibers
include the microtubules, the microfilaments, and the intermediate filaments. Motor proteins, including
myosin, dynein, and kinesin, drive movement of or along the fibers. The motor protein dynamin drives
the formation of membrane vesicles. Accessory or associated proteins modify the structure or activity
of the fibers while cytoskeletal membrane anchors connect the fibers to the cell membrane.
Microtubules and Associated Proteins 

Tubulins

Microtubules, cytoskeletal fibers with a diameter of about 24 nm, have multiple roles in the
cell. Bundles of microtubules form cilia and flagella, which are whip-like extensions of the cell
membrane that are necessary for sweeping materials across an epithelium and for swimming of
sperm, respectively. Marginal bands of microtubules in red blood cells and platelets are important for
these cells' pliability. Organelles, membrane vesicles, and proteins are transported in the cell along
tracks of microtubules. For example, microtubules run through nerve cell axons, allowing bidirectional
transport of materials and membrane vesicles between the cell body and the nerve terminal. Failure to
supply the nerve terminal with these vesicles blocks the transmission of neural signals. Microtubules
are also critical to chromosomal movement during cell division. Both stable and short-lived populations
of microtubules exist in the cell.

Microtubules are polymers of GTP-binding tubulin protein subunits. Each subunit is a
heterodimer of α- and β-tubulin, multiple isoforms of which exist. The hydrolysis of GTP is linked to
the addition of tubulin subunits at the end of a microtubule. The subunits interact head to tail to form
protofilaments; the protofilaments interact side to side to form a microtubule. A microtubule is
polarized, one end ringed with α-tubulin and the other with β-tubulin, and the two ends differ in their
rates of assembly. Generally, each microtubule is composed of 13 protofilaments although 11 or 15
protofilament-microtubules are sometimes found. Cilia and flagella contain doublet microtubules.
Microtubules grow from specialized structures known as centrosomes or microtubule-organizing
centers (MTOCs). MTOCs may contain one or two centrioles, which are pinwheel arrays of triplet
microtubules. The basal body, the organizing center located at the base of a cilium or flagellum,
contains one centriole. Gamma tubulin present in the MTOC is important for nucleating the
polymerization of α- and β-tubulin heterodimers but does not polymerize into microtubules.

Microtubule-Associated Proteins

Microtubule-associated proteins (MAPs) have roles in the assembly and stabilization of
microtubules. One major family of MAPs, assembly MAPs, can be identified in neurons as well as
non-neuronal cells. Assembly MAPs are responsible for cross-linking microtubules in the cytosol.
These MAPs are organized into two domains: a basic microtubule-binding domain and an acidic
projection domain. The projection domain is the binding site for membranes, intermediate filaments, or
other microtubules. Based on sequence analysis, assembly MAPs can be further grouped into two
types: Type I and Type II. Type I MAPs, which include MAP1A and MAP1B, are large, filamentous
molecules that co-purify with microtubules and are abundantly expressed in brain and testes. Type I
MAPs contain several repeats of a positively-charged amino acid sequence motif that binds and
neutralizes negatively charged tubulin, leading to stabilization of microtubules. MAP1A and MAP1B
are each derived from a single precursor polypeptide that is subsequently proteolytically processed to
generate one heavy chain and one light chain.

Another light chain, LC3, is a 16.4 kDa molecule that binds MAP1A, MAP1B, and
microtubules. It is suggested that LC3 is synthesized from a source other than the MAP1A or
MAP1B transcripts, and that the expression of LC3 may be important in regulating the microtubule
binding activity of MAP1A and MAP1B during cell proliferation (Mann, S.S. et al. (1994) J. Biol.
Chem. 269:11492-11497).

Type II MAPs, which include MAP2a, MAP2b, MAP2c, MAP4, and Tau, are characterized
by three to four copies of an 18-residue sequence in the microtubule-binding domain. MAP2a,
MAP2b, and MAP2c are found only in dendrites, MAP4 is found in non-neuronal cells, and Tau is
found in axons and dendrites of nerve cells. Alternative splicing of the Tau mRNA leads to the
existence of multiple forms of Tau protein. Tau phosphorylation is altered in neurodegenerative
disorders such as Alzheimer's disease, Pick's disease, progressive supranuclear palsy, corticobasal
degeneration, and familial frontotemporal dementia and Parkinsonism linked to chromosome 17. The
altered Tau phosphorylation leads to a collapse of the microtubule network and the formation of
intraneuronal Tau aggregates (Spillantini, M.G. and M. Goedert (1998) Trends Neurosci. 21:428-433).
The protein pericentrin is found in the MTOC and has a role in microtubule assembly.
Microfilaments and Associated Proteins
Actins 

Microfilaments, cytoskeletal filaments with a diameter of about 7-9 nm, are vital to cell
locomotion, cell shape, cell adhesion, cell division, and muscle contraction. Assembly and disassembly
of the microfilaments allow cells to change their morphology. Microfilaments are the polymerized
form of actin, the most abundant intracellular protein in the eukaryotic cell. Human cells contain six
isoforms of actin. The three α-actins are found in different kinds of muscle, nonmuscle β-actin and
nonmuscle γ-actin are found in nonmuscle cells, and another γ-actin is found in intestinal smooth
muscle cells. G-actin, the monomeric form of actin, polymerizes into polarized, helical F-actin
filaments, accompanied by the hydrolysis of ATP to ADP. Actin filaments associate to form bundles
and networks, providing a framework to support the plasma membrane and determine cell shape.
These bundles and networks are connected to the cell membrane. In muscle cells, thin filaments
containing actin slide past thick filaments containing the motor protein myosin during contraction. A
family of actin-related proteins exists that are not part of the actin cytoskeleton, but rather associate
with microtubules and dynein.

Actin-Associated Proteins

Actin-associated proteins have roles in cross-linking, severing, and stabilization of actin
filaments and in sequestering actin monomers. Several of the actin-associated proteins have multiple
functions. Bundles and networks of actin filaments are held together by actin cross-linking proteins.
These proteins have two actin-binding sites, one for each filament. Short cross-linking proteins
promote bundle formation while longer, more flexible cross-linking proteins promote network
formation. Calmodulin-like calcium-binding domains in actin cross-linking proteins allow calcium
regulation of cross-linking. Group I cross-linking proteins have unique actin-binding domains and
include the 30 kD protein, EF-1α, fascin, and scruin. Group II cross-linking proteins have a 7,000-MW
actin-binding domain and include villin and dematin. Group III cross-linking proteins have pairs of a
26,000-MW actin-binding domain and include fimbrin, spectrin, dystrophin, ABP 120, and filamin.

Severing proteins regulate the length of actin filaments by breaking them into short pieces or
by blocking their ends. Severing proteins include gCAP39, severin (fragmin), gelsolin, and villin.
Capping proteins can cap the ends of actin filaments, but cannot break filaments. Capping proteins
include CapZ and tropomodulin. The proteins thymosin and profilin sequester actin monomers in the
cytosol, allowing a pool of unpolymerized actin to exist. The actin-associated proteins tropomyosin,
troponin, and caldesmon regulate muscle contraction in response to calcium.

Microtubule and actin filament networks cooperate in processes such as vesicle and organelle
transport, cleavage furrow placement, directed cell migration, spindle rotation, and nuclear migration.
Microtubules and actin may coordinate to transport vesicles, organelles, and cell fate determinants, or
transport may involve targeting and capture of microtubule ends at cortical actin sites. These
cytoskeletal systems may be bridged by myosin-kinesin complexes, myosin-CLIP170 complexes,
formin-homology (FH) proteins, dynein, the dynactin complex, Kar9p, coronin, ERM proteins, and
kelch repeat-containing proteins (for a review, see Goode, B.L. et al. (2000) Curr. Opin. Cell Biol.
12:63-71). The kelch repeat is a motif originally observed in the kelch protein, which is involved in
formation of cytoplasmic bridges called ring canals. A variety of mammalian and other kelch family
proteins have been identified. The kelch repeat domain is believed to mediate interaction with actin
(Robinson, D.N. and L. Cooley (1997) J. Cell Biol. 138:799-810).

ADF/cofilins are a family of conserved 15-18 kDa actin-binding proteins that play a role in
cytokinesis, endocytosis, and development of embryonic tissues, as well as in tissue regeneration
and in pathologies such as ischemia and oxidative or osmotic stress. LIM kinase 1 downregulates ADF
(Carlier, M.F. et al. (1999) J. Biol. Chem. 274:33827-33830).
Intermediate Filaments and Associated Proteins 

Intermediate filaments (IFs) are cytoskeletal fibers with a diameter of about 10 nm,
intermediate between that of microfilaments and microtubules. IFs serve structural roles in the cell,
reinforcing cells and organizing cells into tissues. IFs are particularly abundant in epidermal cells and
in neurons. IFs are extremely stable, and, in contrast to microfilaments and microtubules, do not
function in cell motility.

Five types of IF proteins are known in mammals. Type I and Type II proteins are the acidic
and basic keratins, respectively. Heterodimers of the acidic and basic keratins are the building blocks
of keratin IFs. Keratins are abundant in soft epithelia such as skin and cornea, hard epithelia such as
nails and hair, and in epithelia that line internal body cavities. Mutations in keratin genes lead to
epithelial diseases including epidermolysis bullosa simplex, bullous congenital ichthyosiform
erythroderma (epidermolytic hyperkeratosis), non-epidermolytic and epidermolytic palmoplantar
keratoderma, ichthyosis bullosa of Siemens, pachyonychia congenita, and white sponge nevus. Some
of these diseases result in severe skin blistering. (See, e.g., Wawersik, M. et al. (1997) J. Biol. Chem.
272:32557-32565; and Corden, L.D. and W.H. McLean (1996) Exp. Dermatol. 5:297-307.)
Type III IF proteins include desmin, glial fibrillary acidic protein, vimentin, and peripherin.
Desmin filaments in muscle cells link myofibrils into bundles and stabilize sarcomeres in contracting
muscle. Glial fibrillary acidic protein filaments are found in the glial cells that surround neurons and
astrocytes. Vimentin filaments are found in blood vessel endothelial cells, some epithelial cells, and
mesenchymal cells such as fibroblasts, and are commonly associated with microtubules. Vimentin
filaments may have roles in keeping the nucleus and other organelles in place in the cell. Type IV IFs
include the neurofilaments and nestin. Neurofilaments, composed of three polypeptides NF-L, NF-M,
and NF-H, are frequently associated with microtubules in axons. Neurofilaments are responsible for
the radial growth and diameter of an axon, and ultimately for the speed of nerve impulse transmission.
Changes in phosphorylation and metabolism of neurofilaments are observed in neurodegenerative
diseases including amyotrophic lateral sclerosis, Parkinson's disease, and Alzheimer's disease (Julien,
J.P. and W.E. Mushynski (1998) Prog. Nucleic Acid Res. Mol. Biol. 61:1-23). Type V IFs, the
lamins, are found in the nucleus where they support the nuclear membrane.

IFs have a central α-helical rod region interrupted by short nonhelical linker segments. The
rod region is bracketed, in most cases, by non-helical head and tail domains. The rod regions of
intermediate filament proteins associate to form a coiled-coil dimer. A highly ordered assembly
process leads from the dimers to the IFs. Neither ATP nor GTP is needed for IF assembly, unlike
that of microfilaments and microtubules.

IF-associated proteins (IFAPs) mediate the interactions of IFs with one another and with
other cell structures. IFAPs cross-link IFs into a bundle, into a network, or to the plasma membrane,
and may cross-link IFs to the microfilament and microtubule cytoskeleton. Microtubules and IFs are
particularly closely associated. IFAPs include BPAG1, plakoglobin, desmoplakin I, desmoplakin II,
plectin, ankyrin, filaggrin, and lamin B receptor.
Cytoskeletal-Membrane Anchors 

Cytoskeletal fibers are attached to the plasma membrane by specific proteins. These
attachments are important for maintaining cell shape and for muscle contraction. In erythrocytes, the
spectrin-actin cytoskeleton is attached to the cell membrane by three proteins, band 4.1, ankyrin, and
adducin. Defects in this attachment result in abnormally shaped cells which are more rapidly
degraded by the spleen, leading to anemia. In platelets, the spectrin-actin cytoskeleton is also linked to
the membrane by ankyrin; a second actin network is anchored to the membrane by filamin. In muscle
cells the protein dystrophin links actin filaments to the plasma membrane; mutations in the dystrophin
gene lead to Duchenne muscular dystrophy.
Focal adhesions 
Focal adhesions are specialized structures in the plasma membrane involved in the adhesion of
a cell to a substrate, such as the extracellular matrix. Focal adhesions form the connection between
an extracellular substrate and the cytoskeleton, and affect such functions as cell shape, cell motility
and cell proliferation. Transmembrane integrin molecules form the basis of focal adhesions. Upon
ligand binding, integrins cluster in the plane of the plasma membrane. Cytoskeletal linker proteins such
as the actin binding proteins α-actinin, talin, tensin, vinculin, paxillin, and filamin are recruited to the
clustering site. Key regulatory proteins, such as Rho and Ras family proteins, focal adhesion kinase,
and Src family members are also recruited. These events lead to the reorganization of actin filaments
and the formation of stress fibers. These intracellular rearrangements promote further integrin-ECM
interactions and integrin clustering. Thus, integrins mediate aggregation of protein complexes on both
the cytosolic and extracellular faces of the plasma membrane, leading to the assembly of the focal
adhesion. Many signal transduction responses are mediated via various adhesion complex proteins,
including Src, FAK, paxillin, and tensin. (For a review, see Yamada, K.M. and B. Geiger (1997)
Curr. Opin. Cell Biol. 9:76-85.)

IFs are also attached to membranes by cytoskeletal-membrane anchors. The nuclear lamina
is attached to the inner surface of the nuclear membrane by the lamin B receptor. Vimentin IFs are
attached to the plasma membrane by ankyrin and plectin. Desmosome and hemidesmosome
membrane junctions hold together epithelial cells of organs and skin. These membrane junctions allow
shear forces to be distributed across the entire epithelial cell layer, thus providing strength and rigidity
to the epithelium. IFs in epithelial cells are attached to the desmosome by plakoglobin and
desmoplakins. The proteins that link IFs to hemidesmosomes are not known. Desmin IFs surround
the sarcomere in muscle and are linked to the plasma membrane by paranemin, synemin, and ankyrin.
Motor Proteins 
Myosin-related Motor Proteins

Myosins are actin-activated ATPases, found in eukaryotic cells, that couple hydrolysis of ATP
with motion. Myosin provides the motor function for muscle contraction and intracellular movements
such as phagocytosis and rearrangement of cell contents during mitotic cell division (cytokinesis). The
contractile unit of skeletal muscle, termed the sarcomere, consists of highly ordered arrays of thin
actin-containing filaments and thick myosin-containing filaments. Crossbridges form between the thick
and thin filaments, and the ATP-dependent movement of myosin heads within the thick filaments pulls
the thin filaments, shortening the sarcomere and thus the muscle fiber.

Myosins are composed of one or two heavy chains and associated light chains. Myosin heavy
chains contain an amino-terminal motor or head domain, a neck that is the site of light-chain binding,
and a carboxy-terminal tail domain. The tail domains may associate to form an α-helical coiled coil.
Conventional myosins, such as those found in muscle tissue, are composed of two myosin heavy-chain
subunits, each associated with two light-chain subunits that bind at the neck region and play a
regulatory role. Unconventional myosins, believed to function in intracellular motion, may contain
either one or two heavy chains and associated light chains. There is evidence for about 25 myosin
heavy chain genes in vertebrates, more than half of them unconventional.
Dynein-related Motor Proteins

Dyneins are (-) end-directed motor proteins which act on microtubules. Two classes of
dyneins, cytosolic and axonemal, have been identified. Cytosolic dyneins are responsible for
translocation of materials along cytoplasmic microtubules, for example, transport from the nerve
terminal to the cell body and transport of endocytic vesicles to lysosomes. In addition, viruses often take
advantage of cytoplasmic dyneins to be transported to the nucleus and establish a successful infection
(Sodeik, B. et al. (1997) J. Cell Biol. 136:1007-1021). Virion proteins of herpes simplex virus 1, for
example, interact with the cytoplasmic dynein intermediate chain (Ye, G.J. et al. (2000) J. Virol.
74:1355-1363). Cytoplasmic dyneins are also reported to play a role in mitosis. Axonemal dyneins are
responsible for the beating of flagella and cilia. Dynein on one microtubule doublet walks along the
adjacent microtubule doublet. This sliding force produces bending that causes the flagellum or cilium
to beat. Dyneins have a native mass between 1000 and 2000 kDa and contain either two or three
force-producing heads driven by the hydrolysis of ATP. The heads are linked via stalks to a basal
domain which is composed of a highly variable number of accessory intermediate and light chains.
Cytoplasmic dynein is the largest and most complex of the motor proteins.
Kinesin-related Motor Proteins 

Kinesins are (+) end-directed motor proteins which act on microtubules. The prototypical
kinesin molecule is involved in the transport of membrane-bound vesicles and organelles. This
function is particularly important for axonal transport in neurons. Kinesin is also important in all cell
types for the transport of vesicles from the Golgi complex to the endoplasmic reticulum. This role is
critical for maintaining the identity and functionality of these secretory organelles.

Kinesins define a ubiquitous, conserved family of over 50 proteins that can be classified into at
least 8 subfamilies based on primary amino acid sequence, domain structure, velocity of movement,
and cellular function. (Reviewed in Moore, J.D. and S.A. Endow (1996) Bioessays 18:207-219; and
Hoyt, A.M. (1994) Curr. Opin. Cell Biol. 6:63-68.) The prototypical kinesin molecule is a
heterotetramer comprised of two heavy polypeptide chains (KHCs) and two light polypeptide chains
(KLCs). The KHC subunits are typically referred to as "kinesin." KHC is about 1000 amino acids in
length, and KLC is about 550 amino acids in length. Two KHCs dimerize to form a rod-shaped
molecule with three distinct regions of secondary structure. At one end of the molecule is a globular
motor domain that functions in ATP hydrolysis and microtubule binding. Kinesin motor domains are
highly conserved and share over 70% identity. Beyond the motor domain is an α-helical coiled-coil
region which mediates dimerization. At the other end of the molecule is a fan-shaped tail that
associates with molecular cargo. The tail is formed by the interaction of the KHC C-termini with the
two KLCs.

Members of the more divergent subfamilies of kinesins are called kinesin-related proteins
(KRPs), many of which function during mitosis in eukaryotes (Hoyt, supra). Some KRPs are
required for assembly of the mitotic spindle. In vivo and in vitro analyses suggest that these KRPs
exert force on microtubules that comprise the mitotic spindle, resulting in the separation of spindle
poles. Phosphorylation of KRP is required for this activity. Failure to assemble the mitotic spindle
results in abortive mitosis and chromosomal aneuploidy, the latter condition being characteristic of
cancer cells. In addition, a unique KRP, centromere protein E, localizes to the kinetochore of human
mitotic chromosomes and may play a role in their segregation to opposite spindle poles.
Dynamin-related Motor Proteins

Dynamin is a large GTPase motor protein that functions as a "molecular pinchase," generating
a mechanochemical force used to sever membranes. This activity is important in forming
clathrin-coated vesicles from coated pits in endocytosis and in the biogenesis of synaptic vesicles in neurons.
Binding of dynamin to a membrane leads to dynamin's self-assembly into spirals that may act to
constrict a flat membrane surface into a tubule. GTP hydrolysis induces a change in conformation of
the dynamin polymer that pinches the membrane tubule, leading to severing of the membrane tubule
and formation of a membrane vesicle. Release of GDP and inorganic phosphate leads to dynamin
disassembly. Following disassembly the dynamin may either dissociate from the membrane or remain
associated to the vesicle and be transported to another region of the cell. Three homologous dynamin
genes have been discovered, in addition to several dynamin-related proteins. Conserved dynamin
regions are the N-terminal GTP-binding domain, a central pleckstrin homology domain that binds
membranes, a central coiled-coil region that may activate dynamin's GTPase activity, and a
C-terminal proline-rich domain that contains several motifs that bind SH3 domains on other proteins.
Some dynamin-related proteins do not contain the pleckstrin homology domain or the proline-rich
domain. (See McNiven, M.A. (1998) Cell 94:151-154; Scaife, R.M. and R.L. Margolis (1997) Cell
Signal. 9:395-401.)

The cytoskeleton is reviewed in Lodish, H. et al. (1995) Molecular Cell Biology, Scientific
American Books, New York NY.

The discovery of new cytoskeleton-associated proteins, and the polynucleotides encoding
them, satisfies a need in the art by providing new compositions which are useful in the diagnosis,
prevention, and treatment of cell proliferative disorders, viral infections, and neurological disorders, and
in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino
acid sequences of cytoskeleton-associated proteins.

SUMMARY OF THE INVENTION 
The invention features purified polypeptides, cytoskeleton-associated proteins, referred to
collectively as "CSAP" and individually as "CSAP-1," "CSAP-2," "CSAP-3," "CSAP-4," "CSAP-5,"
"CSAP-6," "CSAP-7," "CSAP-8," "CSAP-9," "CSAP-10," "CSAP-11," "CSAP-12," "CSAP-13,"
"CSAP-14," "CSAP-15," "CSAP-16," "CSAP-17," and "CSAP-18." In one aspect, the invention
provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-18, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-18. In one alternative, the invention provides an isolated
polypeptide comprising the amino acid sequence of SEQ ID NO:1-18.
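
The "at least 90% identical" criterion recited in b) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (gaps written as "-") and counts identity as matching positions divided by aligned columns; this is only one common convention, and the specification may define percent identity differently (for example, through a particular alignment program and parameters). The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an existing pairwise alignment (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only when both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair reaches the stated identity threshold (assumed 90%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical 10-residue reference
    variant = "MKTAYIAKQK"     # one substitution at the last position
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

Under this convention a single substitution in a 10-residue alignment leaves 9 of 10 columns identical, which sits exactly at the 90% boundary; treating gaps as mismatches is a design choice of the sketch, not a statement about how the claims are construed.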

The invention further provides an isolated polynucleotide encoding a polypeptide selected from
the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18, b) a polypeptide comprising a naturally occurring amino acid sequence
at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-18,
c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-18, and d) an immunogenic fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-18. In one alternative, the
polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO:1-18. In
another alternative, the polynucleotide is selected from the group consisting of SEQ ID NO:19-36.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-18, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-18, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-18. In one alternative, the
invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the
invention provides a transgenic organism comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-18, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-18, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-18. The method comprises a)
culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is
transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide comprising a naturally
occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the
group consisting of SEQ ID NO:1-18, c) a biologically active fragment of a polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-18, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-18.

The invention further provides an isolated polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ
ID NO:19-36, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36,
c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to
the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide
comprises at least 60 contiguous nucleotides.

Additionally, the invention provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of
a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ
ID NO:19-36, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36,
c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to
the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the
sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence
complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to
said target polynucleotide, under conditions whereby a hybridization complex is formed between said
probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of
said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe
comprises at least 60 contiguous nucleotides.
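
The sequence requirement on the probe (at least 20, or alternatively at least 60, contiguous nucleotides complementary to the target) can be sketched computationally. The example below is only an illustration under stated assumptions: it treats complementarity as perfect antiparallel Watson-Crick pairing over DNA letters, it says nothing about hybridization specificity or stringency conditions, and the function names and the example target sequence are hypothetical rather than drawn from SEQ ID NO:19-36.

```python
# Complement table for DNA letters (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe: str, target: str) -> int:
    """Length of the longest contiguous probe stretch perfectly complementary to the target.

    A probe stretch pairs with the target wherever its reverse complement occurs
    in the target, so this is the longest common substring of
    reverse_complement(probe) and target (dynamic programming, two rows).
    """
    rc = reverse_complement(probe).upper()
    tgt = target.upper()
    best = 0
    prev = [0] * (len(tgt) + 1)
    for i in range(1, len(rc) + 1):
        cur = [0] * (len(tgt) + 1)
        for j in range(1, len(tgt) + 1):
            if rc[i - 1] == tgt[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def probe_qualifies(probe: str, target: str, min_run: int = 20) -> bool:
    """True when the probe has at least min_run contiguous complementary nucleotides."""
    return longest_complementary_run(probe, target) >= min_run


if __name__ == "__main__":
    target = "ATGGCCAAGCTTGACCTGAAGGTCATCGGT"    # hypothetical 30-nt target
    probe = reverse_complement(target[5:27])     # 22-nt probe designed against it
    print(longest_complementary_run(probe, target))
    print(probe_qualifies(probe, target, min_run=20))
    print(probe_qualifies(probe, target, min_run=60))
```

Setting `min_run=60` corresponds to the alternative in which the probe comprises at least 60 contiguous nucleotides; a real probe design would also have to account for the hybridization conditions, which this sketch deliberately leaves out.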

The invention further provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:19-36, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, c) a
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said
target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b)
detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and,
optionally, if present, the amount thereof.

The invention further provides a composition comprising an effective amount of a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-18, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18,
and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises an amino
acid sequence selected from the group consisting of SEQ ID NO:1-18. The invention additionally
provides a method of treating a disease or condition associated with decreased expression of
functional CSAP, comprising administering to a patient in need of such treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the
invention provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with decreased expression of functional CSAP, comprising
administering to a patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-18, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-18. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the
invention provides a composition comprising an antagonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with overexpression of functional CSAP, comprising
administering to a patient in need of such treatment the composition.

The invention further provides a method of screening for a compound that specifically binds to
a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18. The method comprises a) combining the polypeptide with at least one
test compound under suitable conditions, and b) detecting binding of the polypeptide to the test
compound, thereby identifying a compound that specifically binds to the polypeptide.

The invention further provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-18, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-18, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-18. The method comprises a) combining the polypeptide with at least one
test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity
of the polypeptide in the presence of the test compound, and c) comparing the activity of the
polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of
the test compound, wherein a change in the activity of the polypeptide in the presence of the test
compound is indicative of a compound that modulates the activity of the polypeptide.
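The comparison in step c) can be sketched in Python as a relative change between two activity measurements. The function names and the 20% cutoff are hypothetical and are not specified in the text, which requires only "a change" in activity; a cutoff is assumed here to stand in for assay noise.

```python
def relative_change(activity_with: float, activity_without: float) -> float:
    """Fractional change in activity caused by the test compound."""
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    return (activity_with - activity_without) / activity_without


def modulates_activity(activity_with: float, activity_without: float,
                       threshold: float = 0.2) -> bool:
    """Flag a compound as a modulator when the absolute relative change
    meets the assumed threshold (hypothetical 20% cutoff)."""
    return abs(relative_change(activity_with, activity_without)) >= threshold


# Illustrative numbers only: activity in arbitrary units.
print(modulates_activity(150.0, 100.0))  # increase of 50%
print(modulates_activity(105.0, 100.0))  # increase of 5%
```

A positive relative change would suggest agonist-like behavior and a negative one antagonist-like behavior, under the same assumptions.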

The invention further provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, the method
comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting
altered expression of the target polynucleotide, and c) comparing the expression of the target
polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound.

The invention further provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound; b)
hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, iii) a polynucleotide
having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of
ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide in the biological
sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:19-36, iii) a polynucleotide
complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of
ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment
of a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the
amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated
biological sample with the amount of hybridization complex in an untreated biological sample, wherein
a difference in the amount of hybridization complex in the treated biological sample is indicative of
toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the present invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog for polypeptides of the invention. The probability scores for the matches between each
polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide sequences of the invention, including
predicted motifs and domains, along with the methods, algorithms, and searchable databases used for
analysis of the polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide sequences of the invention, along with selected fragments of the polynucleotide
sequences.

Table 5 shows the representative cDNA library for polynucleotides of the invention.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and 
polypeptides of the invention, along with applicable descriptions, references, and threshold parameters. 

DESCRIPTION OF THE INVENTION

Before the present proteins, nucleotide sequences, and methods are described, it is understood
that this invention is not limited to the particular machines, materials and methods described, as these
may vary. It is also to be understood that the terminology used herein is for the purpose of describing
particular embodiments only, and is not intended to limit the scope of the present invention which will
be limited only by the appended claims.

It must be noted that as used herein and in the appended claims, the singular forms "a," "an,"
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so
forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.
Although any machines, materials, and methods similar or equivalent to those described herein can be
used to practice or test the present invention, the preferred machines, materials and methods are now
described. All publications mentioned herein are cited for the purpose of describing and disclosing the
cell lines, protocols, reagents and vectors which are reported in the publications and which might be
used in connection with the invention. Nothing herein is to be construed as an admission that the
invention is not entitled to antedate such disclosure by virtue of prior invention.

DEFINITIONS

"CSAP" refers to the amino acid sequences of substantially purified CSAP obtained from any
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human,
and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
CSAP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of CSAP either by directly interacting with
CSAP or by acting on components of the biological pathway in which CSAP participates.

An "allelic variant" is an alternative form of the gene encoding CSAP. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times

in a given sequence.

"Altered" nucleic acid sequences encoding CSAP include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as CSAP or a
polypeptide with at least one functional characteristic of CSAP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding CSAP, and improper or unexpected hybridization to allelic variants, with a
locus other than the normal chromosomal locus for the polynucleotide sequence encoding CSAP. The
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of amino
acid residues which produce a silent change and result in a functionally equivalent CSAP. Deliberate
amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological
or immunological activity of CSAP is retained. For example, negatively charged amino acids may
include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and
arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may
include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains
having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine;
and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide,
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid sequence
to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid sequence.
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known
in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity
of CSAP. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small
molecules, or any other compound or composition which modulates the activity of CSAP either by
directly interacting with CSAP or by acting on components of the biological pathway in which CSAP
participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind CSAP polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA,
or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used
carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and
keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures on
the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used
to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The
nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker. (See, e.g., Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13.)

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a vaccinia
virus-based RNA expression system has been used to express specific RNA aptamers at high levels
in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other
left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA;
peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense
molecules may be produced by any method including chemical synthesis or transcription. Once
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring
nucleic acid sequence produced by the cell to form duplexes which block either transcription or
translation. The designation "negative" or "minus" can refer to the antisense strand, and the
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic CSAP, or of any oligopeptide thereof,
to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
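The pairing rule in this example can be reproduced with a short Python sketch. The helper names are illustrative and not part of the specification; only standard DNA base pairing (A with T, G with C) is assumed.

```python
# Watson-Crick pairing for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order
    as the input (so it reads 3'->5' when the input reads 5'->3')."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """The same complementary strand, rewritten in the 5'->3' direction."""
    return complement(seq)[::-1]


print(complement("AGT"))          # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, i.e. 5'-ACT-3'
```

Both outputs describe the same complementary strand; they differ only in the direction in which it is written.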

A "composition comprising a given polynucleotide sequence" and a "composition comprising a
given amino acid sequence" refer broadly to any composition containing the given polynucleotide or
amino acid sequence. The composition may comprise a dry formulation or an aqueous solution.
Compositions comprising polynucleotide sequences encoding CSAP or fragments of CSAP may be
employed as hybridization probes. The probes may be stored in freeze-dried form and may be
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate;
SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to
repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended
and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.

Original Residue        Conservative Substitution
Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr

Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.
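For illustration, the substitution table above can be encoded as a lookup in Python. The residue pairs are transcribed from the table using three-letter codes; the dictionary and function names are hypothetical. Note that the table is directional: a substitution listed for one original residue is not necessarily listed in the reverse direction.

```python
# Original residue -> substitutions regarded as conservative (from the table).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` is listed in the table.
    Residues absent from the table (e.g. Pro) have no listed substitutions."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Ser"))  # True: Ser is listed for Ala
print(is_conservative("Ser", "Ala"))  # False: Ala is not listed for Ser
```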

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide.
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains
at least one biological or immunological function of the natural molecule. A derivative polypeptide is
one modified by glycosylation, pegylation, or any similar process that retains at least one biological or
immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the
evolution of new protein functions.

A "fragment" is a unique portion of CSAP or the polynucleotide encoding CSAP which is
identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15,
16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

A fragment of SEQ ID NO:19-36 comprises a region of unique polynucleotide sequence that
specifically identifies SEQ ID NO:19-36, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:19-36 is useful, for
example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ
ID NO:19-36 from related polynucleotide sequences. The precise length of a fragment of SEQ ID
NO:19-36 and the region of SEQ ID NO:19-36 to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-18 is encoded by a fragment of SEQ ID NO:19-36. A
fragment of SEQ ID NO:1-18 comprises a region of unique amino acid sequence that specifically
identifies SEQ ID NO:1-18. For example, a fragment of SEQ ID NO:1-18 is useful as an
immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-18.
The precise length of a fragment of SEQ ID NO:1-18 and the region of SEQ ID NO:1-18 to which
the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the
intended purpose for the fragment.

A "full length" polynucleotide sequence is one containing at least a translation initiation codon
(e.g., methionine) followed by an open reading frame and a translation termination codon. A "full
length" polynucleotide sequence encodes a "full length" polypeptide sequence.

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between two
or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to
the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences.
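As a minimal sketch of this definition, the following Python function computes the percentage of matching positions between two sequences that have already been aligned, with gaps written as "-". It does not perform the alignment itself, and the choice of denominator (all alignment columns, including gapped ones) is an assumption; alignment programs differ on this point.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the
    same residue. Inputs must be pre-aligned and of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Illustrative alignment: 4 matching columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))  # 66.7
```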

Percent identity between polynucleotide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program. This program is part of the LASERGENE software package, a suite of
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in
Higgins, D.G. and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS
8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as
follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue
weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent
similarity" between aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment
Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from
several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to align a known polynucleotide sequence with other
polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap X drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported
by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a
length over which percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.
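The effect of degeneracy can be shown with a small Python sketch in which two made-up nine-nucleotide sequences differ at four positions yet encode the same tripeptide. Only the six codons needed for the example are included; they follow the standard genetic code, and the sequences and function names are illustrative.

```python
# Subset of the standard genetic code, limited to the codons used below.
CODONS = {
    "GCT": "Ala", "GCC": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "CTG": "Leu", "TTA": "Leu",
}


def translate(dna: str) -> list:
    """Translate a coding sequence codon by codon using the subset table."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def count_matches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


seq_a = "GCTAAACTG"
seq_b = "GCCAAGTTA"
print(translate(seq_a), translate(seq_b))          # both Ala-Lys-Leu
print(count_matches(seq_a, seq_b), "of", len(seq_a), "nucleotides match")
```

Here only 5 of 9 nucleotides match, while the encoded amino acid sequences are identical.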

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to
the percentage of residue matches between at least two polypeptide sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the charge and hydrophobicity at the site of
substitution, thus preserving the structure (and therefore function) of the polypeptide.

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap X drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length,
for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be
used to describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity.
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
after the "washing" step(s). The washing step(s) is particularly important in determining the
stringency of the hybridization process, with more stringent conditions allowing less non-specific
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in
the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989)
Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY;
specifically see volume 2, chapter 9.
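
The relationship between the melting temperature and the wash temperature can be sketched as
follows. The text does not reproduce the equation itself, so the formula used here is an assumption:
a commonly quoted empirical approximation for long DNA:DNA hybrids, whose coefficients vary somewhat
between references. The function names and parameters are hypothetical and serve illustration only.

```python
import math


def melting_temperature_c(na_molar: float, gc_percent: float, length_nt: int,
                          formamide_percent: float = 0.0) -> float:
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    Assumed empirical form (coefficients differ slightly between references):
    81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.63*(% formamide) - 600/length
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


def wash_temperature_range_c(tm_c: float) -> tuple:
    """Wash temperatures about 5 deg C to 20 deg C below the melting temperature."""
    return (tm_c - 20.0, tm_c - 5.0)


if __name__ == "__main__":
    tm = melting_temperature_c(na_molar=1.0, gc_percent=50.0, length_nt=600)
    low, high = wash_temperature_range_c(tm)
    print(f"estimated melting temperature: {tm:.1f} C; wash between {low:.1f} C and {high:.1f} C")
```

Lower salt and added formamide both reduce the estimated melting temperature, which is why they
raise the effective stringency of a wash carried out at a fixed temperature.
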

High stringency conditions for hybridization between polynucleotides of the present invention
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance,
sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances,
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is
strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds between complementary bases. A
hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or formed between one
nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid
support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate
to which cells or their nucleic acids have been fixed).
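
Pairing between complementary bases can be illustrated with a short Python sketch that computes the
reverse complement of a DNA strand and checks whether a probe is perfectly complementary to some
stretch of a target. The function names and example sequences are hypothetical. The check covers
only exact Watson-Crick matches and does not model partially matched hybrids or hybridization
conditions.

```python
# Watson-Crick pairing for DNA, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the strand that would pair with `dna`, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_perfect_match(probe: str, target: str) -> bool:
    """True if `probe` is exactly complementary to some stretch of `target`."""
    return reverse_complement(probe).upper() in target.upper()


if __name__ == "__main__":
    target = "TTATGCTT"  # hypothetical target strand
    print(reverse_complement("GCAT"))
    print(is_perfect_match("GCAT", target))
    print(is_perfect_match("GGGG", target))
```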

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide 
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively. 

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of CSAP which is
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of CSAP which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides,
polypeptides, or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of CSAP. For example, modulation
may cause an increase or a decrease in protein activity, binding characteristics, or any other biological,
functional, or immunological properties of CSAP.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide,
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and
may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of a CSAP may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary
by cell type depending on the enzymatic milieu of CSAP.

"Probe" refers to nucleic acid sequences encoding CSAP, their complements, or fragments
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule.
Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers"
are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target
polynucleotide by complementary base-pairing. The primer may then be extended along the target
DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and
identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas
South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase
sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer
selection program (available to the public from the Whitehead Institute/MIT Center for Genome
Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to
avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of
oligonucleotides for microarrays. (The source code for the latter two primer selection programs may
also be obtained from their respective sources and modified to meet the user's specific needs.) The
PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource
Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing
selection of primers that hybridize to either the most conserved or least conserved regions of aligned
nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved
oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments
identified by any of the above selection methods are useful in hybridization technologies, for example,
as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially
complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are
not limited to those described above.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence 
that is made by an artificial combination of two or more otherwise separated segments of sequence. 
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence.
Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid,
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear 
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of 
the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
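
At the level of written sequence, the RNA equivalent amounts to a simple character substitution. The
sketch below is illustrative only; the function name is hypothetical, and the change of sugar
backbone has no representation in a plain sequence string.

```python
# Replace thymine with uracil, preserving letter case.
_T_TO_U = str.maketrans("Tt", "Uu")


def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence string (T -> U)."""
    return dna.translate(_T_TO_U)


if __name__ == "__main__":
    print(rna_equivalent("ATGCTT"))  # hypothetical sequence
```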

The term "sample" is used in its broadest sense. A sample suspected of containing CSAP,
nucleic acids encoding CSAP, or fragments thereof may comprise a bodily fluid; an extract from a
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA,
in solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular structure
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the
epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the
antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least 60% free,
preferably at least 75% free, and most preferably at least 90% free from other components with
which they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene expression
by a particular cell type or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral
infection, electroporation, heat shock, lipofection, and particle bombardment. The term "transformed
cells" includes stably transformed cells in which the inserted DNA is capable of replication either as
an autonomously replicating plasmid or as part of the host chromosome, as well as transiently
transformed cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to animals
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well known in the art. The
nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell,
by way of deliberate genetic manipulation, such as by microinjection or by infection with a
recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in vitro
fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The transgenic
organisms contemplated in accordance with the present invention include bacteria, cyanobacteria,
fungi, plants and animals. The isolated DNA of the present invention can be introduced into the host
by methods known in the art, for example infection, transfection, transformation or transconjugation.
Techniques for transferring the DNA of the present invention into such organisms are widely known
and provided in references such as Sambrook et al. (1989), supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-
1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding
polypeptide may possess additional functional domains or lack domains that are present in the
reference molecule. Species variants are polynucleotide sequences that vary from one species to
another. The resulting polypeptides will generally have significant amino acid identity relative to each
other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene
between individuals of a given species. Polymorphic variants also may encompass "single nucleotide
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state.

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-
1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence
identity over a certain defined length of one of the polypeptides.

THE INVENTION

The invention is based on the discovery of new human cytoskeleton-associated proteins 
(CSAP), the polynucleotides encoding CSAP, and the use of these compositions for the diagnosis, 
treatment, or prevention of cell proliferative disorders, viral infections, and neurological disorders.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an
Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3
shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog.
Column 4 shows the probability scores for the matches between each polypeptide and its homolog(s).
Column 5 shows the annotation of the GenBank homolog(s) along with relevant citations where
applicable, all of which are expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and
2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column
3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS
program of the GCG sequence analysis software package (Genetics Computer Group, Madison WI).
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7
shows analytical methods for protein structure/function analysis and in some cases, searchable
databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are cytoskeleton-associated proteins. For example,
SEQ ID NO:5 is 94% identical to dog Band 4.1-like 5 protein (GenBank ID g8979743) as determined
by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score
is 1.6e-264, which indicates the probability of obtaining the observed polypeptide sequence alignment
by chance. SEQ ID NO:5 also contains a Band 4.1 family FERM domain as determined by searching
for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of
conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and
PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:5 is a Band 4.1
family protein. In an alternative example, SEQ ID NO:7 is 95% identical to human beta-tubulin
(GenBank ID g1805274) as determined by the Basic Local Alignment Search Tool (BLAST). (See
Table 2.) The BLAST probability score is 5.4e-227, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:7 also contains a tubulin/FtsZ
family domain as determined by searching for statistically significant matches in the hidden Markov
model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data
from BLIMPS and MOTIFS analyses provide further corroborative evidence that SEQ ID NO:7 is a
tubulin. In an alternative example, SEQ ID NO:11 is 80% identical, from residue M1 to residue G529,
to Mus musculus type II cytokeratin (GenBank ID g6092075) as determined by the Basic Local
Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 2.5e-213, which
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ
ID NO:11 also contains an intermediate filament domain as determined by searching for statistically
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein
family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses
provide further corroborative evidence that SEQ ID NO:11 is an intermediate filament protein. In an
alternative example, SEQ ID NO:17 is 90% identical, from residue M1 to residue I888, to Mus
musculus POSH protein (GenBank ID g3002588) as determined by the Basic Local Alignment Search
Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which indicates the probability of
obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:17 also contains an
SH3 domain as determined by searching for statistically significant matches in the hidden Markov
model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data
from BLIMPS and MOTIFS analyses provide further corroborative evidence that SEQ ID NO:17 is
an SH3-containing protein. SEQ ID NO:1-4, SEQ ID NO:6, SEQ ID NO:8-10, SEQ ID NO:12-16
and SEQ ID NO:18 were analyzed and annotated in a similar manner. The algorithms and parameters
for the analysis of SEQ ID NO:1-18 are described in Table 7.

As shown in Table 4, the full length polynucleotide sequences of the present invention were
assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any
combination of these two types of sequences. Column 1 lists the polynucleotide sequence
identification number (Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide
consensus sequence number (Incyte ID) for each polynucleotide of the invention, and the length of
each polynucleotide sequence in basepairs. Column 2 shows the nucleotide start (5') and stop (3')
positions of the cDNA and/or genomic sequences used to assemble the full length polynucleotide
sequences of the invention, and of fragments of the polynucleotide sequences which are useful, for
example, in hybridization or amplification technologies that identify SEQ ID NO:19-36 or that
distinguish between SEQ ID NO:19-36 and related polynucleotide sequences.

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank
cDNAs or ESTs which contributed to the assembly of the full length polynucleotide sequences. In
addition, the polynucleotide fragments described in column 2 may identify sequences derived from the
ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the
designation "ENST"). Alternatively, the polynucleotide fragments described in column 2 may be
derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the polynucleotide fragments described in
column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by
an "exon stitching" algorithm. For example, a polynucleotide sequence identified as
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the
identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is the
number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific exons
that may have been manually edited during analysis (See Example V). Alternatively, the
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte
project identification number, gAAAAA being the GenBank identification number of the human
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the GenBank
identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog,
and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used
as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier (denoted by "NM,"
"NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods associated with the prefixes (see Example IV and Example V).






Prefix             Type of analysis and/or examples of programs

GNN, GFG, ENST     Exon prediction from genomic sequences using, for example,
                   GENSCAN (Stanford University, CA, USA) or FGENES
                   (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI                Hand-edited analysis of genomic sequences.

FL                 Stitched or stretched genomic sequences (see Example V).

INCY               Full length transcript and exon prediction from mapping of EST
                   sequences to the genome. Genomic location and EST composition
                   data are combined to predict the exons and resulting transcript.
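
The prefix table can be read as a simple lookup. The following is a minimal sketch only: the function name classify_component, the fallback string, and the shortened method descriptions are illustrative assumptions and are not part of the described analysis pipeline.

```python
# Prefix-to-method lookup transcribed from the prefix table.
# classify_component and the "unrecognized prefix" fallback are
# hypothetical names introduced for illustration only.
PREFIX_METHODS = {
    "GNN": "exon prediction from genomic sequences",
    "GFG": "exon prediction from genomic sequences",
    "ENST": "exon prediction from genomic sequences",
    "GBI": "hand-edited analysis of genomic sequences",
    "FL": "stitched or stretched genomic sequences",
    "INCY": "full length transcript and exon prediction from EST mapping",
}


def classify_component(identifier: str) -> str:
    """Return the analysis method implied by the identifier's prefix."""
    name = identifier.strip().upper()
    # Check longer prefixes first so a short prefix cannot shadow a longer one.
    for prefix in sorted(PREFIX_METHODS, key=len, reverse=True):
        if name.startswith(prefix):
            return PREFIX_METHODS[prefix]
    return "unrecognized prefix"
```

A call such as classify_component("FL123456_00001") would report a stitched or stretched sequence; the identifier shown is invented for the example.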



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in
Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte
cDNA identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotide
sequences which were assembled using Incyte cDNA sequences. The representative cDNA library
is the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences
which were used to assemble and confirm the above polynucleotide sequences. The tissues and
vectors which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

The invention also encompasses CSAP variants. A preferred CSAP variant is one which has
at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid
sequence identity to the CSAP amino acid sequence, and which contains at least one functional or
structural characteristic of CSAP.
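
As an illustration of the identity thresholds, the sketch below computes percent identity over two sequences that have already been aligned (gaps written as "-"). It is a simplified stand-in, not the alignment method defined elsewhere in this document; the function names and the choice of denominator (all aligned columns that are not gap-versus-gap) are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 80% / 90% / 95% tiers named in the text."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return f"at least about {threshold}%"
    return "below 80%"
```

For two invented ten-residue peptides differing at one position, percent_identity gives 90 and identity_tier places the pair in the "at least about 90%" tier.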

The invention also encompasses polynucleotides which encode CSAP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:19-36, which encodes CSAP. The polynucleotide
sequences of SEQ ID NO:19-36, as presented in the Sequence Listing, embrace the equivalent RNA
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the
sugar backbone is composed of ribose instead of deoxyribose.
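
At the level of the written sequence, the RNA equivalent differs from the DNA listing only in the thymine-to-uracil substitution. A minimal sketch, with a hypothetical function name and an invented input:

```python
def to_rna(dna: str) -> str:
    """Return the RNA-equivalent string: every thymine (T) becomes uracil (U)."""
    return dna.upper().replace("T", "U")
```

The ribose-for-deoxyribose change is a property of the molecule and has no counterpart in the sequence string.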

The invention also encompasses a variant of a polynucleotide sequence encoding CSAP. In
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at least
about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide
sequence encoding CSAP. A particular aspect of the invention encompasses a variant of a
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO:19-36
which has at least about 70%, or alternatively at least about 85%, or even at least about 95%
polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting of SEQ
ID NO:19-36. Any one of the polynucleotide variants described above can encode an amino acid
sequence which contains at least one functional or structural characteristic of CSAP.
In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a
polynucleotide sequence encoding CSAP. A splice variant may have portions which have significant
sequence identity to the polynucleotide sequence encoding CSAP, but will generally have a greater or
lesser number of polynucleotides due to additions or deletions of blocks of sequence arising from
alternate splicing of exons during mRNA processing. A splice variant may have less than about 70%,
or alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence
identity to the polynucleotide sequence encoding CSAP over its entire length; however, portions of the
splice variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least
about 95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide
sequence encoding CSAP. Any one of the splice variants described above can encode an amino acid
sequence which contains at least one functional or structural characteristic of CSAP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding CSAP, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring CSAP, and all such variations are to be considered as
being specifically disclosed.
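
To give a sense of scale for "each and every possible variation", the sketch below counts the coding sequences that the standard genetic code allows for a given peptide and then writes out one of them from a table of chosen codons. The function names, the four-residue peptide, and EXAMPLE_PREFERRED are illustrative assumptions; in particular, EXAMPLE_PREFERRED is a placeholder and does not represent measured codon usage of any host.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not counted).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


# Placeholder table: one valid codon per residue, chosen only for illustration.
EXAMPLE_PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "S": "AGC"}


def recode(peptide: str, preferred: dict) -> str:
    """Write the peptide as DNA using one chosen codon per amino acid."""
    return "".join(preferred[aa] for aa in peptide.upper())
```

For the invented peptide MKLS there are 1 x 2 x 6 x 6 = 72 possible coding sequences, and recode selects a single one of them; the count grows multiplicatively with peptide length.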

Although nucleotide sequences which encode CSAP and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring CSAP under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding CSAP or
its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally
occurring codons. Codons may be selected to increase the rate at which expression of the peptide
occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the nucleotide
sequence encoding CSAP and its derivatives without altering the encoded amino acid sequences
include the production of RNA transcripts having more desirable properties, such as a greater half-life,
than transcripts produced from the naturally occurring sequence.

The invention also encompasses production of DNA sequences which encode CSAP and
CSAP derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic sequence may be inserted into any of the many available expression vectors and cell systems
using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce
mutations into a sequence encoding CSAP or any fragment thereof.

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:19-36 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-
511.) Hybridization conditions, including annealing and wash conditions, are described in "Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV),
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler
(Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA
sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system
(Molecular Dynamics, Sunnyvale CA), or other systems known in the art. The resulting sequences
are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F.M.
(1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers,
R.A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)

The nucleic acid sequences encoding CSAP may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising
a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent
to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et
al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and
ligations may be used to insert an engineered double-stranded sequence into a region of unknown
sequence before performing PCR. Other methods which may be used to retrieve unknown sequences
are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060.)
Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo
Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in
finding intron/exon junctions. For all PCR-based methods, primers may be designed using
commercially available software, such as OLIGO 4.06 primer analysis software (National
Biosciences, Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in length,
to have a GC content of about 50% or more, and to anneal to the template at temperatures of about
68°C to 72°C.
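
The length and GC-content guidelines for primers can be expressed as a simple filter. This is a hedged sketch rather than a substitute for primer analysis software: the function names are hypothetical, and the annealing-temperature criterion is deliberately not modeled because it depends on the melting-temperature model chosen.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    if not p:
        raise ValueError("primer must not be empty")
    return (p.count("G") + p.count("C")) / len(p)


def meets_primer_guidelines(primer: str) -> bool:
    """Check the 22-30 nucleotide length and >= 50% GC content guidelines.

    The 68-72 degree C annealing criterion is not evaluated here.
    """
    return 22 <= len(primer) <= 30 and gc_fraction(primer) >= 0.5
```

A candidate that fails this filter can be discarded before any more detailed analysis; a candidate that passes still needs its annealing behavior checked separately.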

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-
specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire
process from loading of samples to computer analysis and electronic data display may be computer
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments
which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof which
encode CSAP may be cloned in recombinant DNA molecules that direct expression of CSAP, or
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy
of the genetic code, other DNA sequences which encode substantially the same or a functionally
equivalent amino acid sequence may be produced and used to express CSAP.

The nucleotide sequences of the present invention can be engineered using methods generally
known in the art in order to alter CSAP-encoding sequences for a variety of purposes including, but
not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-
mediated site-directed mutagenesis may be used to introduce mutations that create new restriction
sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve
the biological properties of CSAP, such as its biological or enzymatic activity or its ability to bind to
other molecules or compounds. DNA shuffling is a process by which a library of gene variants is
produced using PCR-mediated recombination of gene fragments. The library is then subjected to
selection or screening procedures that identify those gene variants with the desired properties. These
preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and
selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid molecular
evolution. For example, fragments of a single gene containing random point mutations may be
recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively,
fragments of a given gene may be recombined with fragments of homologous genes in the same gene
family, either from the same or different species, thereby maximizing the genetic diversity of multiple
naturally occurring genes in a directed and controllable manner.
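
The recursive shuffle-then-select cycle described above can be mimicked in a toy model. The sketch below is only an in-silico analogy: it recombines fixed-size blocks from equal-length parent strings instead of modeling homology-driven PCR reassembly, and the function names, parameters, and scoring function are all assumptions supplied for illustration.

```python
import random


def make_chimera(parents, block, rng):
    """Build one chimera by taking each block from a randomly chosen parent."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must have equal length")
    return "".join(rng.choice(parents)[i:i + block]
                   for i in range(0, length, block))


def shuffle_and_select(parents, score, rounds=3, library_size=20,
                       keep=4, block=4, seed=0):
    """Alternate library generation and selection for a number of rounds."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        library = [make_chimera(pool, block, rng) for _ in range(library_size)]
        # Keep the current pool among the candidates so the best score
        # never decreases; sort first to make tie-breaking deterministic.
        candidates = sorted(set(library) | set(pool))
        pool = sorted(candidates, key=score, reverse=True)[:keep]
    return pool
```

In this model the "screen" is whatever scoring function the caller supplies, which stands in for the selection or screening procedure applied to the real library.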

In another embodiment, sequences encoding CSAP may be synthesized, in whole or in part,
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively,
CSAP itself or a fragment thereof may be synthesized using chemical methods. For example, peptide
synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g.,
Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New York NY, pp.
55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved
using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence
of CSAP, or any part thereof, may be altered during direct synthesis and/or combined with sequences
from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a
sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing.
(See, e.g., Creighton, supra, pp. 28-53.)

In order to express a biologically active CSAP, the nucleotide sequences encoding CSAP or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotide sequences
encoding CSAP. Such elements may vary in their strength and specificity. Specific initiation signals
may also be used to achieve more efficient translation of sequences encoding CSAP. Such signals
include the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where
sequences encoding CSAP and its initiation codon and upstream regulatory sequences are inserted
into the appropriate expression vector, no additional transcriptional or translational control signals may
be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted,
exogenous translational control signals including an in-frame ATG initiation codon should be provided
by the vector. Exogenous translational elements and initiation codons may be of various origins, both
natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers
appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl.
Cell Differ. 20:125-162.)
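
The role of sequences adjacent to the ATG initiation codon can be illustrated with a rough check of Kozak context. The sketch assumes the commonly cited summary of the Kozak consensus, in which a purine at position -3 and a G at position +4 (counting the A of ATG as +1) are the most influential positions; that rule, the function name, and the three-level labels are assumptions and are not taken from this document.

```python
PURINES = {"A", "G"}


def kozak_strength(sequence: str, atg_index: int) -> str:
    """Classify the context of the ATG at atg_index as weak/adequate/strong."""
    seq = sequence.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    # Position -3 lies three bases upstream of the A; +4 follows the G of ATG.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    hits = (minus3 in PURINES) + (plus4 == "G")
    return ("weak", "adequate", "strong")[hits]
```

A check of this kind only flags the local context of a candidate start codon; it says nothing about reading frame, which must be confirmed separately when a vector supplies the ATG.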

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing sequences encoding CSAP and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et al. (1995)
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding CSAP. These include, but are not limited to, microorganisms such as bacteria transformed
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola,
M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA
90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. (1994)
Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.) The
invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotide sequences encoding CSAP. For example, routine cloning,
subcloning, and propagation of polynucleotide sequences encoding CSAP can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Life Technologies). Ligation of sequences encoding CSAP into the vector's multiple cloning
site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol.
Chem. 264:5503-5509.) When large quantities of CSAP are needed, e.g., for the production of
antibodies, vectors which direct high level expression of CSAP may be used. For example, vectors
containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of CSAP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable integration
of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra;
Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184.)

Plant systems may also be used for expression of CSAP. Transcription of sequences
encoding CSAP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al.
(1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill,
New York NY, pp. 191-196.)

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, sequences encoding CSAP may be ligated into
an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses CSAP in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma
virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-
based vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet. 15:345-
355.)

For long term production of recombinant proteins in mammalian systems, stable expression of
CSAP in cell lines is preferred. For example, sequences encoding CSAP can be transformed into cell
lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before
being switched to selective media. The purpose of the selectable marker is to confer resistance to a
selective agent, and its presence allows growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue
culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively. (See, e.g., Wigler, M. et
al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g.,
Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981)
J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which
alter cellular requirements for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc.
Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins
(GFP; Clontech), β glucuronidase and its substrate β-glucuronide, or luciferase and its substrate
luciferin may be used. These markers can be used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression attributable to a specific vector system.
(See, e.g., Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.)

Although the presence/absence of marker gene expression suggests that the gene of interest
is also present, the presence and expression of the gene may need to be confirmed. For example, if
the sequence encoding CSAP is inserted within a marker gene sequence, transformed cells containing
sequences encoding CSAP can be identified by the absence of marker gene function. Alternatively, a
marker gene can be placed in tandem with a sequence encoding CSAP under the control of a single
promoter. Expression of the marker gene in response to induction or selection usually indicates
expression of the tandem gene as well.

In general, host cells that contain the nucleic acid sequence encoding CSAP and that express
CSAP may be identified by a variety of procedures known to those of skill in the art. These
procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR
amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or
chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of CSAP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on CSAP is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art. (See,
e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN,
Sect. IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana
Press, Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding CSAP include
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, the sequences encoding CSAP, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega
(Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be used for
ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic
agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with nucleotide sequences encoding CSAP may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode CSAP may be designed to contain signal sequences which direct
secretion of CSAP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of
the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which cleaves a "prepro" or "pro" form of the
protein may also be used to specify protein targeting, folding, and/or activity. Different host cells
which have specific cellular machinery and characteristic mechanisms for post-translational activities
(e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture
Collection (ATCC, Manassas VA) and may be chosen to ensure the correct modification and
processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant nucleic acid
sequences encoding CSAP may be ligated to a heterologous sequence resulting in translation of a
fusion protein in any of the aforementioned host systems. For example, a chimeric CSAP protein
containing a heterologous moiety that can be recognized by a commercially available antibody may
facilitate the screening of peptide libraries for inhibitors of CSAP activity. Heterologous protein and
peptide moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins,
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site
located between the CSAP encoding sequence and the heterologous protein sequence, so that CSAP
may be cleaved away from the heterologous moiety following purification. Methods for fusion protein
expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially
available kits may also be used to facilitate expression and purification of fusion proteins.
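
The pairing of fusion moieties with purification approaches listed above can be captured as a small lookup. This is a sketch for orientation only; the dictionary, set, and function names are hypothetical, and the mapping simply restates the pairs named in the preceding paragraph.

```python
# Fusion moiety -> immobilized ligand, restating the pairs given in the text.
AFFINITY_MATRIX = {
    "GST": "glutathione",
    "MBP": "maltose",
    "Trx": "phenylarsine oxide",
    "CBP": "calmodulin",
    "6-His": "metal-chelate",
}

# Epitope tags purified with tag-specific antibodies.
IMMUNOAFFINITY_TAGS = {"FLAG", "c-myc", "HA"}


def purification_route(tag: str) -> str:
    """Return the purification approach associated with a fusion tag."""
    if tag in AFFINITY_MATRIX:
        return f"affinity purification on immobilized {AFFINITY_MATRIX[tag]} resin"
    if tag in IMMUNOAFFINITY_TAGS:
        return "immunoaffinity purification with a tag-specific antibody"
    return "no route listed"
```

The list of moieties in the text is explicitly non-limiting, so the "no route listed" fallback means only that a tag is absent from this particular table.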

In a further embodiment of the invention, synthesis of radiolabeled CSAP may be achieved in
vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These
systems couple transcription and translation of protein-coding sequences operably associated with the
T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid
precursor, for example, 35S-methionine.

CSAP of the present invention or fragments thereof may be used to screen for compounds
that specifically bind to CSAP. At least one and up to a plurality of test compounds may be screened
for specific binding to CSAP. Examples of test compounds include antibodies, oligonucleotides,
proteins (e.g., receptors), or small molecules.

In one embodiment, the compound thus identified is closely related to the natural ligand of
CSAP, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2):
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which CSAP
binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the
compound can be rationally designed using known techniques. In one embodiment, screening for
these compounds involves producing appropriate cells which express CSAP, either as a secreted
protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E.
coli. Cells expressing CSAP or cell membrane fractions which contain CSAP are then contacted with
a test compound and binding, stimulation, or inhibition of activity of either CSAP or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the
assay may comprise the steps of combining at least one test compound with CSAP, either in solution
or affixed to a solid support, and detecting the binding of CSAP to the compound. Alternatively, the
assay may detect or measure binding of a test compound in the presence of a labeled competitor.
Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural
product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

CSAP of the present invention or fragments thereof may be used to screen for compounds
that modulate the activity of CSAP. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for CSAP
activity, wherein CSAP is combined with at least one test compound, and the activity of CSAP in the
presence of a test compound is compared with the activity of CSAP in the absence of the test
compound. A change in the activity of CSAP in the presence of the test compound is indicative of a
compound that modulates the activity of CSAP. Alternatively, a test compound is combined with an in
vitro or cell-free system comprising CSAP under conditions suitable for CSAP activity, and the assay
is performed. In either of these assays, a test compound which modulates the activity of CSAP may
do so indirectly and need not come in direct contact with CSAP. At least one and up to a
plurality of test compounds may be screened.
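
The comparison of CSAP activity in the presence and absence of a test compound can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the 20% cutoff, and the activity readings are hypothetical and are not taken from this disclosure.

```python
def classify_modulation(activity_with, activity_without, threshold=0.20):
    """Compare activity measured with and without a test compound.

    Returns a (label, fractional_change) pair. The threshold is a
    hypothetical cutoff: fractional changes smaller than this are
    treated as "no effect".
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    # Fractional change relative to the no-compound control.
    change = (activity_with - activity_without) / activity_without
    if change >= threshold:
        return "activator", change
    if change <= -threshold:
        return "inhibitor", change
    return "no effect", change


if __name__ == "__main__":
    # Hypothetical readings in arbitrary activity units.
    print(classify_modulation(50.0, 100.0))
    print(classify_modulation(150.0, 100.0))
```

In practice the cutoff would be chosen from the variability of replicate control measurements rather than fixed in advance.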

In another embodiment, polynucleotides encoding CSAP or their mammalian homologs may be
"knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease. (See, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337.) For example,
mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and
grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted
by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding CSAP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding CSAP can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region
of a polynucleotide encoding CSAP is injected into animal ES cells, and the injected sequence
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae
are implanted as described above. Transgenic progeny or inbred lines are studied and treated with
potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively,
a mammal inbred to overexpress CSAP, e.g., by secreting CSAP in its milk, may also serve as a
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between
regions of CSAP and cytoskeleton-associated proteins. In addition, examples of tissues expressing
CSAP can be found in Table 6. Therefore, CSAP appears to play a role in cell proliferative disorders,
viral infections, and neurological disorders. In the treatment of disorders associated with increased
CSAP expression or activity, it is desirable to decrease the expression or activity of CSAP. In the
treatment of disorders associated with decreased CSAP expression or activity, it is desirable to
increase the expression or activity of CSAP.

Therefore, in one embodiment, CSAP or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of CSAP. Examples of such disorders include, but are not limited to, a cell proliferative
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia
vera, psoriasis, primary thrombocythemia, and a cancer including adenocarcinoma, leukemia,
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; a viral infection such as those caused by adenoviruses
(acute respiratory disease, pneumonia), arenaviruses (lymphocytic choriomeningitis), bunyaviruses
(Hantavirus), coronaviruses (pneumonia, chronic bronchitis), hepadnaviruses (hepatitis), herpesviruses
(herpes simplex virus, varicella-zoster virus, Epstein-Barr virus, cytomegalovirus), flaviviruses (yellow
fever), orthomyxoviruses (influenza), papillomaviruses (cancer), paramyxoviruses (measles, mumps),
picornaviruses (rhinovirus, poliovirus, coxsackie-virus), polyomaviruses (BK virus, JC virus),
poxviruses (smallpox), reoviruses (Colorado tick fever), retroviruses (human immunodeficiency virus,
human T lymphotropic virus), rhabdoviruses (rabies), rotaviruses (gastroenteritis), and togaviruses
(encephalitis, rubella); and a neurological disorder such as epilepsy, ischemic cerebrovascular disease,
stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia,
Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor
neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple
sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural
empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral
central nervous system disease, a prion disease including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial
nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders,
peripheral nervous system disorders, dermatomyositis and polymyositis; inherited, metabolic, endocrine,
and toxic myopathies; myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic
neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, and Tourette's
disorder.

In another embodiment, a vector capable of expressing CSAP or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of CSAP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified CSAP in 
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent 
a disorder associated with decreased expression or activity of CSAP including, but not limited to, those 
provided above. 

In still another embodiment, an agonist which modulates the activity of CSAP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of CSAP including, but not limited to, those listed above.

In a further embodiment, an antagonist of CSAP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of CSAP. Examples of such
disorders include, but are not limited to, those cell proliferative disorders, viral infections, and
neurological disorders described above. In one aspect, an antibody which specifically binds CSAP
may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express CSAP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding CSAP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of CSAP including, but not limited to, those described above.

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary
sequences, or vectors of the invention may be administered in combination with other appropriate
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made
by one of ordinary skill in the art, according to conventional pharmaceutical principles. The
combination of therapeutic agents may act synergistically to effect the treatment or prevention of the
various disorders described above. Using this approach, one may be able to achieve therapeutic
efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of CSAP may be produced using methods which are generally known in the
art. In particular, purified CSAP may be used to produce antibodies or to screen libraries of
pharmaceutical agents to identify those which specifically bind CSAP. Antibodies to CSAP may also
be generated using methods that are well known in the art. Such antibodies may include, but are not
limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments
produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer
formation) are generally preferred for therapeutic use.

For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, and
others may be immunized by injection with CSAP or with any fragment or oligopeptide thereof which
has immunogenic properties. Depending on the host species, various adjuvants may be used to
increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels
such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG
(bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to CSAP
have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at
least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are
identical to a portion of the amino acid sequence of the natural protein. Short stretches of CSAP
amino acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric
molecule may be produced.

Monoclonal antibodies to CSAP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not limited
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J.
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.)

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda,
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single
chain antibodies may be adapted, using methods known in the art, to produce CSAP-specific single
chain antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g., Burton,
D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

Antibodies may also be produced by inducing in vivo production in the lymphocyte population
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in
the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299.)

Antibody fragments which contain specific binding sites for CSAP may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D.
et al. (1989) Science 246:1275-1281.)

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between CSAP and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive
to two non-interfering CSAP epitopes is generally used, but a competitive binding assay may also be
employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques
may be used to assess the affinity of antibodies for CSAP. Affinity is expressed as an association
constant, Ka, which is defined as the molar concentration of CSAP-antibody complex divided by the
molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka
determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for
multiple CSAP epitopes, represents the average affinity, or avidity, of the antibodies for CSAP. The
Ka determined for a preparation of monoclonal antibodies, which are monospecific for a particular
CSAP epitope, represents a true measure of affinity. High-affinity antibody preparations with Ka
ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the CSAP-
antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with Ka
ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar
procedures which ultimately require dissociation of CSAP, preferably in active form, from the antibody
(Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell,
J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York
NY).
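
As a worked illustration of the Scatchard analysis mentioned above, the following Python sketch estimates an association constant from equilibrium binding data. It assumes the simplest single-site model, in which bound/free = Ka x (Bmax - bound), so that a straight line fitted to bound/free versus bound has slope -Ka and x-intercept Bmax. The function name and the data are hypothetical, and a heterogeneous polyclonal preparation would generally give a curved plot for which this single-line fit yields only an average value.

```python
def scatchard_fit(bound, free):
    """Fit a line to (bound, bound/free) and return (Ka, Bmax).

    bound, free: equilibrium molar concentrations of bound and free
    ligand. Assumes a single class of binding sites (an illustrative
    simplification).
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("bound concentrations must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope              # L/mole when concentrations are in mole/L
    bmax = intercept / ka    # x-intercept of the fitted line
    return ka, bmax


if __name__ == "__main__":
    # Synthetic data generated from Ka = 1e9 L/mole, Bmax = 1e-9 mole/L.
    true_ka, true_bmax = 1e9, 1e-9
    bound = [2e-10, 4e-10, 6e-10, 8e-10]
    free = [b / (true_ka * (true_bmax - b)) for b in bound]
    print(scatchard_fit(bound, free))
```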

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine
the quality and suitability of such preparations for certain downstream applications. For example, a
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg
specific antibody/ml, is generally employed in procedures requiring precipitation of CSAP-antibody
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for
antibody quality and usage in various applications, are generally available. (See, e.g., Catty, supra, and
Coligan et al. supra.)

In another embodiment of the invention, the polynucleotides encoding CSAP, or any fragment
or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
CSAP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
CSAP. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ.)
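
The basic sequence relationship behind an antisense DNA oligonucleotide can be sketched as follows. This Python snippet simply returns the reverse complement of a target region; the function name and example sequence are hypothetical, and practical antisense design also weighs factors such as target accessibility and oligonucleotide chemistry, which are outside the scope of this sketch.

```python
# Map each target base (DNA or RNA) to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(target):
    """Return the DNA reverse complement of a DNA or RNA target, 5' to 3'."""
    seq = target.upper()
    invalid = set(seq) - set("ACGTU")
    if invalid:
        raise ValueError("unexpected characters: " + "".join(sorted(invalid)))
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical six-nucleotide mRNA fragment.
    print(antisense_oligo("AUGGCC"))
```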
In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g.,
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res.
25(14):2730-2736.)

In another embodiment of the invention, polynucleotides encoding CSAP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-
linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g.,
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988)
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis
B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in CSAP expression or regulation causes disease, the expression of
CSAP from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
CSAP are treated by constructing mammalian expression vectors encoding CSAP and introducing
these vectors by mechanical means into CSAP-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic
gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and
(v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem.
62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin.
Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of CSAP include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). CSAP
may be expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter
(e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci.
USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau
(1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen));
the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V.
and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous
gene encoding CSAP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these
standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to CSAP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding CSAP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et
al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses
a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.

Propagation of retrovirus vectors, transduction of a population of cells (e.g., T-cells), and the
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998)
Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding CSAP to cells which have one or more genetic abnormalities with respect to
the expression of CSAP. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999)
Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 389:239-242, both
incorporated by reference herein.

In another alternative, a herpes-based, gene therapy delivery system is used to deliver
polynucleotides encoding CSAP to target cells which have one or more genetic abnormalities with
respect to the expression of CSAP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing CSAP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92
which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by
this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and
ICP22. For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al.
(1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned
herpesvirus sequences, the generation of recombinant virus following the transfection of multiple
plasmids containing different segments of the large herpesvirus genomes, the growth and propagation
of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of
ordinary skill in the art.

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding CSAP to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity
(e.g., protease and polymerase). Similarly, inserting the coding sequence for CSAP into the alphavirus
genome in place of the capsid-coding region results in the production of a large number of CSAP-
coding RNAs and the synthesis of high levels of CSAP in vector transduced cells. While alphavirus
infection is typically associated with cell lysis within a few days, the ability to establish a persistent
infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that
the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application
(Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the
introduction of CSAP into a variety of cell types. The specific transduction of a subset of cells in a
population may require the sorting of cells prior to transduction. The methods of manipulating
infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and
performing alphavirus infections, are well known to those with ordinary skill in the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -10
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes
inhibition of the ability of the double helix to open sufficiently for the binding of polymerases,
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have
been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177.) A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of sequences encoding CSAP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
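
The initial scanning step described above can be illustrated with a minimal sketch. The function name, the rule for centering the window on the cleavage triplet, and the default lengths are assumptions made for illustration only; the sketch does not evaluate secondary structure or accessibility, which would still need to be assessed separately.

```python
# Illustrative sketch only: names and the window-centering rule are assumptions.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_windows(rna, min_len=15, max_len=20):
    """Return (site_index, triplet, window) for each GUA/GUU/GUC site.

    The window is a candidate target region of up to max_len nucleotides,
    roughly centered on the triplet and clipped to the sequence ends.
    Windows shorter than min_len are dropped.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    flank = (max_len - 3) // 2
    hits = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet not in CLEAVAGE_TRIPLETS:
            continue
        start = max(0, i - flank)
        end = min(len(rna), start + max_len)
        start = max(0, end - max_len)  # re-anchor if clipped at the 3' end
        window = rna[start:end]
        if len(window) >= min_len:
            hits.append((i, triplet, window))
    return hits
```

Each returned window is only a candidate; a region with strong predicted secondary structure would be discarded in a later step.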

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared
by any method known in the art for the synthesis of nucleic acid molecules. These include techniques
for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA
sequences encoding CSAP. Such DNA sequences may be incorporated into a wide variety of
vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA
constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell
lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine,
and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine,
guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding CSAP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased CSAP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding CSAP may be therapeutically useful, and in the treatment of disorders associated with
decreased CSAP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding CSAP may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding CSAP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding CSAP are assayed by
any method commonly known in the art. Typically, the expression of a specific nucleotide is detected
by hybridization with a probe having a nucleotide sequence complementary to the sequence of the
polynucleotide encoding CSAP. The amount of hybridization may be quantified, thus forming the
basis for a comparison of the expression of the polynucleotide both with and without exposure to one
or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a
test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins,
D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res. 28:E15) or a
human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res. Commun.
268:8-13). A particular embodiment of the present invention involves screening a combinatorial library
of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified
oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T.W. et al.
(1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No. 6,022,691).
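
The comparison of quantified hybridization with and without a test compound can be sketched as a simple fold-change calculation. The function name, the two-fold threshold, and the three call labels below are illustrative assumptions; the text above does not specify how large a change must be to count as an alteration.

```python
def screen_compounds(baseline_signal, treated_signals, threshold=2.0):
    """Classify test compounds by fold change in hybridization signal.

    baseline_signal: signal measured without any test compound.
    treated_signals: mapping of compound name -> signal after exposure.
    threshold: assumed fold change required to call an effect.
    Returns a mapping of compound name -> (fold_change, call).
    """
    if baseline_signal <= 0:
        raise ValueError("baseline signal must be positive")
    results = {}
    for name, signal in treated_signals.items():
        fold = signal / baseline_signal
        if fold >= threshold:
            call = "promoter"
        elif fold <= 1.0 / threshold:
            call = "inhibitor"
        else:
            call = "no effect"
        results[name] = (fold, call)
    return results
```

In practice the threshold would be chosen from the variability of replicate measurements rather than fixed in advance.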

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466.)

Any of the therapeutic methods described above may be applied to any subject in need of
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition which
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient.
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may consist of CSAP,
antibodies to CSAP, and mimetics, agonists, antagonists, or inhibitors of CSAP.

The compositions utilized in this invention may be administered by any number of routes
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,
intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides and
proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung
have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S.
et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration without
needle injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising CSAP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, CSAP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and
route of administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example CSAP
or fragments thereof, antibodies of CSAP, and agonists, antagonists or inhibitors of CSAP, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
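
As a worked illustration of the therapeutic index, the following minimal sketch computes the LD50/ED50 ratio and orders candidate compositions so that those with the largest index come first. The function names and the example values used to exercise them are hypothetical and are not taken from any experiment described herein.

```python
def therapeutic_index(ld50, ed50):
    """Return the LD50/ED50 ratio; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(compositions):
    """Sort compositions by therapeutic index, largest first.

    compositions: mapping of name -> (ld50, ed50).
    Returns a list of (name, index) pairs.
    """
    scored = [(name, therapeutic_index(ld50, ed50))
              for name, (ld50, ed50) in compositions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For instance, a hypothetical composition with an LD50 of 200 and an ED50 of 10 (same units) has an index of 20 and would be preferred over one with an index of 3.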

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response
to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or
biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind CSAP may be used for the
diagnosis of disorders characterized by expression of CSAP, or in assays to monitor patients being
treated with CSAP or agonists, antagonists, or inhibitors of CSAP. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic
assays for CSAP include methods which utilize the antibody and a label to detect CSAP in human
body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification,
and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of
reporter molecules, several of which are described above, are known in the art and may be used.

A variety of protocols for measuring CSAP, including ELISAs, RIAs, and FACS, are known
in the art and provide a basis for diagnosing altered or abnormal levels of CSAP expression. Normal
or standard values for CSAP expression are established by combining body fluids or cell extracts
taken from normal mammalian subjects, for example, human subjects, with antibodies to CSAP under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of CSAP expressed in
subject, control, and disease samples from biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing disease.

In another embodiment of the invention, the polynucleotides encoding CSAP may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of CSAP may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
CSAP, and to monitor regulation of CSAP levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding CSAP or closely related molecules may be used to
identify nucleic acid sequences which encode CSAP. The specificity of the probe, whether it is made
from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine whether the
probe identifies only naturally occurring sequences encoding CSAP, allelic variants, or related
sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the CSAP encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:19-36 or from
genomic sequences including promoters, enhancers, and introns of the CSAP gene.
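
The 50% sequence identity criterion for probes can be illustrated with a minimal sketch that scores two pre-aligned sequences. The convention used here (matches divided by the number of aligned columns in which neither sequence has a gap) is one common choice and is an assumption; other definitions of percent identity give different values, and producing the alignment itself is outside the scope of the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned to the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_probe_identity(aligned_probe, aligned_target, minimum=50.0):
    """True if the probe shares at least `minimum` percent identity."""
    return percent_identity(aligned_probe, aligned_target) >= minimum
```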

Means for producing specific hybridization probes for DNAs encoding CSAP include the
cloning of polynucleotide sequences encoding CSAP or CSAP derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding CSAP may be used for the diagnosis of disorders
associated with expression of CSAP. Examples of such disorders include, but are not limited to, a cell
proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis,
hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and a cancer including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid,
penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a viral infection such as
those caused by adenoviruses (acute respiratory disease, pneumonia), arenaviruses (lymphocytic
choriomeningitis), bunyaviruses (Hantavirus), coronaviruses (pneumonia, chronic bronchitis),
hepadnaviruses (hepatitis), herpesviruses (herpes simplex virus, varicella-zoster virus, Epstein-Barr
virus, cytomegalovirus), flaviviruses (yellow fever), orthomyxoviruses (influenza), papillomaviruses
(cancer), paramyxoviruses (measles, mumps), picornaviruses (rhinovirus, poliovirus, coxsackie-virus),
polyomaviruses (BK virus, JC virus), poxviruses (smallpox), reovirus (Colorado tick fever),
retroviruses (human immunodeficiency virus, human T lymphotropic virus), rhabdoviruses (rabies),
rotaviruses (gastroenteritis), and togaviruses (encephalitis, rubella); and a neurological disorder such as
epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's
disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders,
amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy,
retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial
and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, a prion disease including
kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial
insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous
sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation
and other developmental disorders of the central nervous system, cerebral palsy, neuroskeletal
disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular
dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis
and polymyositis; inherited, metabolic, endocrine, and toxic myopathies; myasthenia gravis, periodic
paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective
disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias,
paranoid psychoses, postherpetic neuralgia, and Tourette's disorder. The polynucleotide sequences
encoding CSAP may be used in Southern or northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in
microarrays utilizing fluids or tissues from patients to detect altered CSAP expression. Such
qualitative or quantitative methods are well known in the art.

In a particular aspect, the nucleotide sequences encoding CSAP may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding CSAP may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a suitable
incubation period, the sample is washed and the signal is quantified and compared with a standard
value. If the amount of signal in the patient sample is significantly altered in comparison to a control
sample, then the presence of altered levels of nucleotide sequences encoding CSAP in the sample
indicates the presence of the associated disorder. Such assays may also be used to evaluate the
efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor
the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of CSAP,
a normal or standard profile for expression is established. This may be accomplished by combining
body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a
fragment thereof, encoding CSAP, under conditions suitable for hybridization or amplification.
Standard hybridization may be quantified by comparing the values obtained from normal subjects with
values from an experiment in which a known amount of a substantially purified polynucleotide is used.
Standard values obtained in this manner may be compared with values obtained from samples from
patients who are symptomatic for a disorder. Deviation from standard values is used to establish the
presence of a disorder.
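
One way to make "deviation from standard values" concrete is to express a patient value as a z-score relative to values from normal subjects. This is a minimal sketch under stated assumptions: the function name and the cutoff of two standard deviations are illustrative only, and no particular statistical criterion is prescribed in the text above.

```python
from statistics import mean, stdev


def deviates_from_standard(normal_values, patient_value, z_cutoff=2.0):
    """Compare a patient value with standard values from normal subjects.

    Returns (z_score, flagged), where flagged is True when the patient
    value lies at least z_cutoff sample standard deviations from the mean
    of the normal values. The cutoff is an assumed, illustrative choice.
    """
    if len(normal_values) < 2:
        raise ValueError("at least two normal values are required")
    mu = mean(normal_values)
    sd = stdev(normal_values)
    if sd == 0:
        raise ValueError("normal values show no variation")
    z = (patient_value - mu) / sd
    return z, abs(z) >= z_cutoff
```

The same comparison can be repeated on successive samples to follow whether a patient's value moves back toward the normal range during treatment.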

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development
of the disease, or may provide a means for detecting the disease prior to the appearance of actual
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ
preventative measures or aggressive treatment earlier, thereby preventing the development or further
progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding CSAP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding CSAP, or a fragment of a polynucleotide complementary to the polynucleotide encoding
CSAP, and will be employed under optimized conditions for identification of a specific gene or
condition. Oligomers may also be employed under less stringent conditions for detection or
quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences
encoding CSAP may be used to detect single nucleotide polymorphisms (SNPs). SNPs are
substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease
in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from the polynucleotide sequences encoding CSAP are used to amplify DNA using the
polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal
tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the
secondary and tertiary structures of PCR products in single-stranded form, and these differences are
detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are
fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as
DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the
high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
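
The idea behind in silico SNP detection, comparing overlapping fragments that assemble into one consensus, can be sketched with a simple per-column count. This is a deliberately simplified stand-in, not the statistical models or chromatogram analyses referred to above: it assumes the fragments are already aligned to equal length, and it treats a variant seen fewer than a minimum number of times as a likely sequencing error. The function name and the threshold are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Report alignment columns that may contain a polymorphism.

    aligned_reads: equal-length strings over A/C/G/T; any other character
    (for example '-' or '.') marks a position not covered by that read.
    A column is reported when the second most common base occurs at least
    min_minor_count times. Returns (position, major_base, minor_base).
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    snps = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads
                         if read[pos] in "ACGT")
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            snps.append((pos, ranked[0][0], ranked[1][0]))
    return snps
```

A variant supported by a single read is filtered out here, which mimics in a crude way the removal of variation caused by sequencing errors.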

Methods which may also be used to quantify the expression of CSAP include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C.
et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large
numbers of genes simultaneously as described below. The microarray may also be used to identify
genetic variants, mutations, and polymorphisms. This information may be used to determine gene
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective
treatment regimen for that patient. For example, therapeutic agents which are highly effective and
display the fewest side effects may be selected for a patient based on his/her pharmacogenomic
profile.

In another embodiment, CSAP, fragments of CSAP, or antibodies specific for CSAP may be
used as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No.
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.
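
The two quantities named above, the number of expressed genes and their relative abundance, can be computed from per-gene signal values as in the following minimal sketch. The function name, the input format, and the detection threshold are assumptions for illustration; how raw hybridization signal is converted to per-gene values is not addressed.

```python
def transcript_image(signal_by_gene, min_signal=1):
    """Summarize expression as relative abundance per expressed gene.

    signal_by_gene: mapping of gene name -> non-negative signal or count.
    min_signal: assumed detection threshold; genes below it are treated
    as not expressed.
    Returns a dict ordered from most to least abundant whose values sum
    to 1 (or an empty dict if nothing is expressed).
    """
    expressed = {gene: value for gene, value in signal_by_gene.items()
                 if value >= min_signal}
    total = sum(expressed.values())
    if total == 0:
        return {}
    ordered = sorted(expressed.items(), key=lambda item: item[1], reverse=True)
    return {gene: value / total for gene, value in ordered}
```

The length of the returned dict gives the number of expressed genes, and its values give their relative abundance under the sampled condition.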

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies,
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present invention
may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000)
Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a
signature similar to that of a compound with known toxicity, it is likely to share those toxic properties.
These fingerprints or signatures are most useful and refined when they contain expression information
from a large number of genes and gene families. Ideally, a genome-wide measurement of expression
provides the highest quality signature. Even genes whose expression is not altered by any tested
compounds are important as well, as the levels of expression of these genes are used to normalize the
rest of the expression data. The normalization procedure is useful for comparison of expression data
after treatment with different compounds. While the assignment of gene function to elements of a
toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not
necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for
example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released
February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is
important and desirable in toxicological screening using toxicant signatures to include all expressed
gene sequences.
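
The normalization and signature matching described above can be sketched as follows. Expression values are divided by the mean level of genes assumed to be unaltered by treatment, and two normalized signatures are then compared by Pearson correlation. The choice of Pearson correlation, the function names, and the data layout are illustrative assumptions; the text above does not specify a particular statistical matching method.

```python
from math import sqrt


def normalize(profile, reference_genes):
    """Scale a profile by the mean level of unaltered reference genes."""
    ref = sum(profile[gene] for gene in reference_genes) / len(reference_genes)
    if ref <= 0:
        raise ValueError("reference expression must be positive")
    return {gene: value / ref for gene, value in profile.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(vx * vy)


def signature_similarity(test_profile, known_profile, reference_genes):
    """Correlate two signatures after normalizing each to reference genes."""
    a = normalize(test_profile, reference_genes)
    b = normalize(known_profile, reference_genes)
    genes = sorted(set(a) & set(b))  # compare only genes measured in both
    return pearson([a[g] for g in genes], [b[g] for g in genes])
```

A similarity near 1 between a test compound and a compound of known toxicity would suggest a shared signature, without requiring any knowledge of the function of the individual genes.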

In one embodiment, the toxicity of a test compound is assessed by treating a biological sample
containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated
biological sample are hybridized with one or more probes specific to the polynucleotides of the present
invention, so that transcript levels corresponding to the polynucleotides of the present invention may be
quantified. The transcript levels in the treated biological sample are compared with levels in an
untreated biological sample. Differences in the transcript levels between the two samples are
indicative of a toxic response caused by the test compound in the treated sample.

Another particular embodiment relates to the use of the polypeptide sequences of the present
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global
pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome
can be subjected individually to further analysis. Proteome expression patterns, or profiles, are
analyzed by quantifying the number of expressed proteins and their relative abundance under given
conditions and at a given time. A profile of a cell's proteome may thus be generated by separating
and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by
isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl
sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins
are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot
is generally proportional to the level of the protein in the sample. The optical densities of equivalently
positioned protein spots from different samples, for example, from biological samples either treated or
untreated with a test compound or therapeutic agent, are compared to identify any changes in protein
spot density related to the treatment. The proteins in the spots are partially sequenced using, for
example, standard methods employing chemical or enzymatic cleavage followed by mass
spectrometry. The identity of the protein in a spot may be determined by comparing its partial
sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the
present invention. In some cases, further sequence data may be obtained for definitive protein
identification.
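The comparison of a partial sequence against a set of polypeptide sequences can be sketched as an exact substring search. The sequences and names below are hypothetical placeholders, not sequences of the invention, and exact matching is an assumption; real peptide identification typically tolerates ambiguity in the sequencing data.

```python
def find_matches(partial_sequence, polypeptides, min_length=5):
    """Return {name: [0-based positions]} where the partial sequence occurs exactly.

    polypeptides maps a name to a one-letter amino acid string.
    """
    if len(partial_sequence) < min_length:
        raise ValueError(f"partial sequence should have at least {min_length} residues")
    hits = {}
    for name, sequence in polypeptides.items():
        positions = []
        start = sequence.find(partial_sequence)
        while start != -1:
            positions.append(start)
            start = sequence.find(partial_sequence, start + 1)
        if positions:
            hits[name] = positions
    return hits

# Hypothetical polypeptides used only to exercise the function.
example_polypeptides = {
    "P1": "MKTAYIAKQRQISFVK",
    "P2": "MSDEEVKQRQA",
}
example_hits = find_matches("AKQRQ", example_polypeptides)
```

A longer partial sequence narrows the set of candidate proteins, which is why further sequence data may be needed for a definitive identification.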

A proteomic profile may also be generated using antibodies specific for CSAP to quantify the
levels of CSAP expression. In one embodiment, the antibodies are used as elements on a microarray,
and protein expression levels are quantified by exposing the microarray to the sample and detecting
the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a
variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or
amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array
element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological sample.
A difference in the amount of protein between the two samples is indicative of a toxic response to the
test compound in the treated sample. Individual proteins are identified by sequencing the amino acid
residues of the individual proteins and comparing these partial sequences to the polypeptides of the
present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are incubated
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized
by the antibodies is quantified. The amount of protein in the treated biological sample is compared
with the amount in an untreated biological sample. A difference in the amount of protein between the
two samples is indicative of a toxic response to the test compound in the treated sample.

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad.
Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et
al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA
94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of
microarrays are well known and thoroughly described in DNA Microarrays: A Practical Approach,
M. Schena, ed. (1999) Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding CSAP may be used
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop
genetic linkage maps, for example, which correlate the inheritance of a disease state with the
inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP).
(See, for example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic
map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man
(OMIM) World Wide Web site. Correlation between the location of the gene encoding CSAP on a
physical map and a specific disorder, or a predisposition to a specific disorder, may help define the
region of DNA associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any
sequences mapping to that area may represent associated or regulatory genes for further investigation.
(See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant
invention may also be used to detect differences in the chromosomal location due to translocation,
inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, CSAP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between CSAP and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with CSAP, or fragments thereof,
and washed. Bound CSAP is then detected by methods well known in the art. Purified CSAP can
also be coated directly onto plates for use in the aforementioned drug screening techniques.
Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a
solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing 
antibodies capable of binding CSAP specifically compete with a test compound for binding CSAP. In 
this manner, antibodies can be used to detect the presence of any peptide which shares one or more 
antigenic determinants with CSAP.

In additional embodiments, the nucleotide sequences which encode CSAP may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are, therefore,
to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way
whatsoever.

The disclosures of all patents, applications and publications, mentioned above and below,
including U.S. Ser. No. 60/260,085, U.S. Ser. No. 60/268,554, U.S. Ser. No. 60/269,111, and U.S. Ser.
No. 60/271,211 are expressly incorporated by reference herein.

EXAMPLES
I. Construction of cDNA Libraries

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD
database (Incyte Genomics, Palo Alto CA). Some tissues were homogenized and lysed in guanidinium
isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of
denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with
chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was
isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles
(QIAGEN, Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively,
RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the
POLY(A)PURE mRNA purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the
UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using
the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra,
units 5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000
bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen),
PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto CA), pRARE (Incyte
Genomics), or pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were
transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

II. Isolation of cDNA Clones 

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP
96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner
(Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis 

Incyte cDNA recovered in plasmids as described in Example II was sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation such
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as
the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel,
1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the
techniques disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo
sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); and hidden
Markov model (HMM)-based protein family databases such as PFAM. (HMM is a probabilistic
approach which analyzes consensus primary structures of gene families. See, for example, Eddy, S.R.
(1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based on
BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to produce
full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched
sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV and V)
were used to extend Incyte cDNA assemblages to full length. Assembly was performed using
programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open
reading frames using programs based on GeneMark, BLAST, and FASTA. The full length
polynucleotide sequences were translated to derive the corresponding full length polypeptide
sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine residues
of the full length translated polypeptide. Full length polypeptide sequences were subsequently
analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt,
the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, and hidden Markov
model (HMM)-based protein family databases such as PFAM. Full length polynucleotide sequences
are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San
Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence
alignments are generated using default parameters specified by the CLUSTAL algorithm as
incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also
calculates the percent identity between aligned sequences.
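As an illustration of what a percent identity figure summarizes, the following sketch computes it from two already-aligned sequences. The definition used here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption for illustration; the exact definition applied by the MEGALIGN program is not stated in this text and may differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded under this assumed definition
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared

# Hypothetical aligned nucleotide sequences: 8 ungapped columns, 7 identical.
example_identity = percent_identity("ACGT-ACGT", "ACGTTACCT")
```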

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold
parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second
column provides brief descriptions thereof, the third column presents appropriate references, all of
which are incorporated by reference herein in their entirety, and the fourth column presents, where
applicable, the scores, probability values, and other parameters used to evaluate the strength of a
match between two sequences (the higher the score or the lower the probability value, the greater the
identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide and 
polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID 
NO:19-36. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and
amplification technologies are described in Table 4, column 2.

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative cytoskeleton-associated proteins were initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is
a general-purpose gene identification program which analyzes genomic DNA sequences from a
variety of organisms (see Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and
S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to
form an assembled cDNA sequence extending from a methionine to a stop codon. The output of
Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of
sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan
predicted cDNA sequences encode cytoskeleton-associated proteins, the encoded polypeptides were
analyzed by querying against PFAM models for cytoskeleton-associated proteins. Potential
cytoskeleton-associated proteins were also identified by homology to Incyte cDNA sequences that
had been annotated as cytoskeleton-associated proteins. These selected Genscan-predicted
sequences were then compared by BLAST analysis to the genpept and gbpri public databases.
Where necessary, the Genscan-predicted sequences were then edited by comparison to the top
BLAST hit from genpept to correct errors in the sequence predicted by Genscan, such as extra or
omitted exons. BLAST analysis was also used to find any Incyte cDNA or public cDNA coverage of
the Genscan-predicted sequences, thus providing evidence for transcription. When Incyte cDNA
coverage was available, this information was used to correct or confirm the Genscan predicted
sequence. Full length polynucleotide sequences were obtained by assembling Genscan-predicted
coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the assembly
process described in Example III. Alternatively, full length polynucleotide sequences were derived
entirely from edited or unedited Genscan-predicted coding sequences.
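The step in which predicted exons are concatenated into an assembled cDNA extending from a methionine to a stop codon can be illustrated with a short sketch. This is not the Genscan algorithm, which predicts the exons themselves with a probabilistic model; the exon coordinates are taken as given, and the toy genomic sequence, the forward-strand assumption, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def assemble_exons(genomic, exons):
    """Join exon intervals (0-based, half-open, forward strand) in genomic order."""
    return "".join(genomic[start:end] for start, end in sorted(exons))

def first_orf(cdna):
    """Return the first ATG-to-stop reading frame found, or None if there is none."""
    start = cdna.find("ATG")
    while start != -1:
        for i in range(start, len(cdna) - 2, 3):
            if cdna[i:i + 3] in STOP_CODONS:
                return cdna[start:i + 3]
        start = cdna.find("ATG", start + 1)
    return None

# Toy locus: flank + exon 1 + intron + exon 2 + flank (all sequences invented).
toy_genomic = "CC" + "ATGGCT" + "GTAAGTTTCAG" + "GAATAG" + "CC"
toy_exons = [(2, 8), (19, 25)]
toy_cdna = assemble_exons(toy_genomic, toy_exons)
toy_orf = first_orf(toy_cdna)
```

An extra or omitted exon in the prediction changes the concatenated sequence and can shift the reading frame, which is why the predicted sequences were checked against the top BLAST hit.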
V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences

Partial cDNA sequences were extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on
more than one sequence in the cluster were identified, and intervals thus identified were considered to
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear
along their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
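The grouping of intervals that are "equivalent by transitivity" can be sketched with a disjoint-set (union-find) structure. This is a swapped-in, generic technique chosen for illustration; the text says only that the stitching algorithm is based on graph theory and dynamic programming, and the interval labels below are hypothetical.

```python
class DisjointSet:
    """Minimal union-find used to merge items known to be pairwise equivalent."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving: point the item at its grandparent as we walk up.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

def equivalence_classes(pairs):
    """Group interval labels into classes implied by pairwise equivalences."""
    sets = DisjointSet()
    for a, b in pairs:
        sets.union(a, b)
    groups = {}
    for item in list(sets.parent):
        groups.setdefault(sets.find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())

# One cDNA interval observed on two genomic sequences (labels are invented).
observed_pairs = [
    ("cDNA1:10-50", "genomicA:100-140"),
    ("cDNA1:10-50", "genomicB:7-47"),
]
interval_classes = equivalence_classes(observed_pairs)
```

Here the two genomic intervals are never compared directly, yet they end up in the same class because each matches the cDNA interval, which mirrors the three-interval example in the text.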
"Stretched" Sequences

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or Genscan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous
genomic sequences from the public human genome databases. Partial DNA sequences were
therefore "stretched" or extended by the addition of homologous genomic sequences. The resultant
stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of CSAP Encoding Polynucleotides 

The sequences which were used to assemble SEQ ID NO:19-36 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:19-36 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances
are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid
markers whose sequences were included in each of the clusters. Human genome maps and other
resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease
genes map within or in proximity to the intervals indicated above.

In this manner, SEQ ID NO:24 was mapped to chromosome 18 within the interval from 40.4
to 42.7 centiMorgans. SEQ ID NO:31 was mapped to chromosome 1 within the interval from the
p-terminus to 16.40 centiMorgans. SEQ ID NO:33 was mapped to chromosome 19 within the interval
from 19.1 to 35.5 centiMorgans. SEQ ID NO:25 was mapped to chromosome 6 within the interval
from the p-terminus to 14.2 centiMorgans, to chromosome 16 within the interval from 44.3 to 45.4
centiMorgans, to chromosome 6 within the interval from 42.0 to 44.9 centiMorgans, and to
chromosome 2 within the interval from 120.8 to 134.1 centiMorgans. More than one map location is
reported for SEQ ID NO:25, indicating that sequences having different map locations were assembled
into a single cluster. This situation occurs, for example, when sequences having strong similarity, but
not complete identity, are assembled into a single cluster.
VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel
(1995) supra, ch. 4 and 16.)

Analogous computer techniques applying BLAST were used to search for identical or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer
search can be modified to determine whether any particular match is categorized as exact or similar.
The basis of the search is the product score, which is defined as:

\[
\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
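The product score calculation described above can be written out as a short sketch. It assumes a single ungapped HSP described only by its counts of matching and mismatching bases; the function and parameter names are illustrative and are not taken from any BLAST implementation.

```python
def blast_score(matches, mismatches):
    """Score an ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches

def product_score(matches, mismatches, len_seq1, len_seq2):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)

# Shorter sequence of 100 bases compared against a 250-base sequence.
full_identity = product_score(100, 0, 100, 250)      # 100% identity, full overlap
partial_overlap = product_score(70, 0, 100, 250)     # 100% identity, 70% overlap
lower_identity = product_score(88, 12, 100, 250)     # 88% identity, full overlap
```

Under these assumptions the first two cases evaluate to exactly 100 and 70. The 88% identity case evaluates to about 69, and a 79% identity, full overlap case to about 49, which is consistent with the rounded values of 70 and 50 quoted in the text.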

Alternatively, polynucleotide sequences encoding CSAP are analyzed with respect to the
tissue sources from which they were derived. For example, some full length sequences are
assembled, at least in part, with overlapping Incyte cDNA sequences (see Example 10). Each cDNA
sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is
classified into one of the following organ/tissue categories: cardiovascular system; connective tissue;
digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia,
male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system;
pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or
urinary tract. The number of libraries in each category is counted and divided by the total number of
libraries across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and
disease-specific expression of cDNA encoding CSAP. cDNA sequences and cDNA library/tissue
information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
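The counting step described above can be sketched as follows. The function name and the example library labels are hypothetical; the sketch only illustrates dividing the number of libraries in each category by the total number of libraries.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return {category: percent of libraries}, given one label per cDNA library."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Hypothetical example: five libraries contributing cDNA for one sequence.
libs = ["nervous system", "nervous system", "liver", "skin", "nervous system"]
pct = category_percentages(libs)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.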
VIII. Extension of CSAP Encoding Polynucleotides

Full length polynucleotide sequences were also produced by extension of an appropriate
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target
sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would
result in hairpin structures and primer-primer dimerizations was avoided.
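Two of the stated design criteria, primer length and GC content, lend themselves to a simple check. The sketch below is not the OLIGO 4.06 algorithm; the function names, the hard thresholds standing in for "about", and the candidate sequence are assumptions, and annealing temperature, hairpin, and dimer checks are deliberately left out.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check length (22 to 30 nt) and GC content (50% or more) only."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


candidate = "GCCATGGCTAGCCTGAAGGTCCAT"  # hypothetical 24-mer, 14 of 24 bases are G or C
ok = meets_basic_criteria(candidate)
```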

Selected human cDNA libraries were used to extend the sequence. If more than one
extension was necessary or desired, additional or nested sets of primers were designed.

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE
enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters
for primer pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min;
Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage
at 4°C. In the alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C,
3 min; Step 2: 94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4
repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.
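The first cycling program can be written out as data to make its structure explicit. The sketch reads "repeated 20 times" as twenty further passes after the first pass through Steps 2 to 4, which is an interpretation rather than something the text states; the helper name is hypothetical, and ramp times and the final 4°C hold are ignored.

```python
def expand_program(initial, cycle, repeats, final):
    """Flatten a cycling program into an ordered list of (temp_C, seconds)."""
    # One first pass through the cycle plus `repeats` further passes (assumed reading).
    return [initial] + cycle * (1 + repeats) + [final]


pci_program = expand_program(
    initial=(94, 180),                        # Step 1: 94 C, 3 min
    cycle=[(94, 15), (60, 60), (68, 120)],    # Steps 2-4
    repeats=20,                               # Step 5
    final=(68, 300),                          # Step 6: 68 C, 5 min
)
total_seconds = sum(seconds for _, seconds in pci_program)
```

For the T7 and SK+ primer pair, only the Step 3 temperature changes (57 in place of 60), so the programmed hold time is the same.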

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%)
agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended
clones were religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector
(Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction
site overhangs, and transfected into competent E. coli cells. Transformed cells were selected on
antibiotic-containing media, and individual colonies were picked and cultured overnight at 37°C in
384-well plates in LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following
parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step
5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was
quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA
recoveries were reamplified using the same conditions as described above. Samples were diluted with
20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing
primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM
BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotide sequences are verified using the above procedure or
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides
designed for such extension, and an appropriate genomic library.
IX. Labeling and Use of Individual Hybridization Probes

Hybridization probes derived from SEQ ID NO:19-36 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-33P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase
(DuPont NEN, Boston MA). The labeled oligonucleotides are substantially purified using a
SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech).
An aliquot containing 10^ counts per minute of the labeled probe is used in a typical membrane-based
hybridization analysis of human genomic DNA digested with one of the following endonucleases: Ase
I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

X. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical
microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned
technologies should be uniform and solid with a non-porous surface (Schena (1999), supra).
Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may
be produced using available methods and machines well known to those of ordinary skill in the art and
may contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science
270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998)
Nat. Biotechnol. 16:27-31.)

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on
the microarray may be assessed. In one embodiment, microarray preparation and usage is described
in detail below.
Tissue or Cell Sample Preparation

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X first
strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM
dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse
transcription reaction is performed in a 25 µl volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one
with Cy3 and another with Cy5 labeling) is treated with 2.5 µl of 0.5M sodium hydroxide and
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 µl of glycogen (1 mg/ml), 60 µl sodium acetate, and 300 µl of 100% ethanol. The sample is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended
in 14 µl 5X SSC/0.2% SDS.
Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses
primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 µg.
Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C
oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 0.2%
SDS and distilled water as before.
Hybridization

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with
a 1.8 cm2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140
µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about
6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1%
SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that location
to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from
different sources (e.g., representing test and control cells), each labeled with a different fluorophore,
are hybridized to a single array for the purpose of identifying genes that are differentially expressed,
the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and
adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission
spectra) between the fluorophores using each fluorophore's emission spectrum.

A grid is superimposed over the fluorescence signal image such that the signal from each spot
is centered in each element of the grid. The fluorescence signal within each element is then integrated
to obtain a numerical value corresponding to the average intensity of the signal. The software used
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
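The grid integration step can be illustrated with a minimal sketch. It does not reproduce the GEMTOOLS program, whose internals are not described here; it assumes a regular grid of square elements already aligned to the spots, and the function name and toy image are hypothetical.

```python
def mean_intensity_per_element(image, cell):
    """Average pixel intensity within each square grid element.

    image: 2-D list of pixel values; cell: element edge length in pixels.
    Returns a 2-D list with one mean value per grid element.
    """
    rows, cols = len(image) // cell, len(image[0]) // cell
    out = []
    for r in range(rows):
        row_vals = []
        for c in range(cols):
            pixels = [image[y][x]
                      for y in range(r * cell, (r + 1) * cell)
                      for x in range(c * cell, (c + 1) * cell)]
            row_vals.append(sum(pixels) / len(pixels))
        out.append(row_vals)
    return out


# Toy 4 x 4 image divided into four 2 x 2 elements.
image = [
    [0, 0, 10, 10],
    [0, 4, 10, 10],
    [2, 2, 0, 0],
    [2, 2, 0, 8],
]
grid = mean_intensity_per_element(image, cell=2)
```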

XI. Complementary Polynucleotides

Sequences complementary to the CSAP-encoding sequences, or any parts thereof, are used
to detect, decrease, or inhibit expression of naturally occurring CSAP. Although use of
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of CSAP. To
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence
and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary
oligonucleotide is designed to prevent ribosomal binding to the CSAP-encoding transcript.
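Deriving a complementary oligonucleotide from a chosen target stretch amounts to taking its reverse complement. The sketch below shows only that step; the target sequence is a hypothetical 15-base example rather than a CSAP sequence, and selection of the target region itself is left to design software such as that named above.

```python
# Translation table for DNA bases; both cases are handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


target = "ATGGCCAAGCTGACC"  # hypothetical 15-base target stretch
oligo = reverse_complement(target)
```

Applying the function twice returns the original sequence, which is a convenient sanity check on any designed oligonucleotide.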
XII. Expression of CSAP

Expression and purification of CSAP is achieved using bacterial or virus-based expression
systems. For expression of CSAP in bacteria, cDNA is subcloned into an appropriate vector
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
Antibiotic resistant bacteria express CSAP upon induction with isopropyl beta-D-thiogalactopyranoside
(IPTG). Expression of CSAP in eukaryotic cells is achieved by infecting insect or mammalian cell
lines with recombinant Autographica californica nuclear polyhedrosis virus (AcMNPV), commonly
known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA
encoding CSAP by either homologous recombination or bacterial-mediated transposition involving
transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter
drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera
frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the
latter requires additional genetic modifications to baculovirus. (See Engelhard, E.K. et al. (1994) Proc.
Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945.)

In most expression systems, CSAP is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized
glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia
Biotech). Following purification, the GST moiety can be proteolytically cleaved from CSAP at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak).
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra,
ch. 10 and 16). Purified CSAP obtained by these methods can be used directly in the assays shown in
Examples XVI and XVII where applicable.

XIII. Functional Assays

CSAP function is assessed by expressing the sequences encoding CSAP at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad CA), both of which
contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into
a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP;
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate
the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of
fluorescent molecules that diagnose events preceding or coincident with cell death. These events
include changes in nuclear DNA content as measured by staining of DNA with propidium iodide;
changes in cell size and granularity as measured by forward light scatter and 90 degree side light
scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake;
alterations in expression of cell surface and intracellular proteins as measured by reactivity with
specific antibodies; and alterations in plasma membrane composition as measured by the binding of
fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994) Flow Cytometry, Oxford, New York NY.

The influence of CSAP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding CSAP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding CSAP and other genes of interest can be analyzed by northern
analysis or microarray techniques.
XIV. Production of CSAP Specific Antibodies 

CSAP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard protocols.

Alternatively, the CSAP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.)

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the
oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for
antipeptide and anti-CSAP activity by, for example, binding the peptide or CSAP to a substrate,
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat
anti-rabbit IgG.
XV. Purification of Naturally Occurring CSAP Using Specific Antibodies

Naturally occurring or recombinant CSAP is substantially purified by immunoaffinity
chromatography using antibodies specific for CSAP. An immunoaffinity column is constructed by
covalently coupling anti-CSAP antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is
blocked and washed according to the manufacturer's instructions.

Media containing CSAP are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of CSAP (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/CSAP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as
urea or thiocyanate ion), and CSAP is collected.

XVI. Identification of Molecules Which Interact with CSAP 

CSAP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent.
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules
previously arrayed in the wells of a multi-well plate are incubated with the labeled CSAP, washed, and
any wells with labeled CSAP complex are assayed. Data obtained using different concentrations of
CSAP are used to calculate values for the number, affinity, and association of CSAP with the
candidate molecules.
Alternatively, molecules interacting with CSAP are analyzed using the yeast two-hybrid
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).

CSAP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).

XVII. Demonstration of CSAP Activity

A microtubule motility assay for CSAP measures motor protein activity. In this assay,
recombinant CSAP is immobilized onto a glass slide or similar substrate. Taxol-stabilized bovine brain
microtubules (commercially available) in a solution containing ATP and cytosolic extract are perfused
onto the slide. Movement of microtubules as driven by CSAP motor activity can be visualized and
quantified using video-enhanced light microscopy and image analysis techniques. CSAP activity is
directly proportional to the frequency and velocity of microtubule movement.

Alternatively, an assay for CSAP measures the formation of protein filaments in vitro. A
solution of CSAP at a concentration greater than the "critical concentration" for polymer assembly is
applied to carbon-coated grids. Appropriate nucleation sites may be supplied in the solution. The grids
are negative stained with 0.7% (w/v) aqueous uranyl acetate and examined by electron microscopy.
The appearance of filaments of approximately 25 nm (microtubules), 8 nm (actin), or 10 nm
(intermediate filaments) is a demonstration of protein activity.

In another alternative, CSAP activity is measured by the binding of CSAP to protein
filaments. 35S-Met labeled CSAP sample is incubated with the appropriate filament protein (actin,
tubulin, or intermediate filament protein) and complexed protein is collected by immunoprecipitation
using an antibody against the filament protein. The immunoprecipitate is then run out on SDS-PAGE
and the amount of CSAP bound is measured by autoradiography.

Various modifications and variations of the described methods and systems of the invention
will be apparent to those skilled in the art without departing from the scope and spirit of the invention.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Indeed, various modifications of the described modes for carrying out the invention which are obvious
to those skilled in molecular biology or related fields are intended to be within the scope of the
following claims.
What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-18,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-6 and SEQ ID NO:10-18,

c) a polypeptide comprising a naturally occurring amino acid sequence at least 96%
identical to the amino acid sequence of SEQ ID NO:7,

d) a polypeptide comprising a naturally occurring amino acid sequence at least 98%
identical to the amino acid sequence of SEQ ID NO:8,

e) a polypeptide comprising a naturally occurring amino acid sequence at least 99%
identical to the amino acid sequence of SEQ ID NO:9,

f) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-18, and

g) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-18.

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2.

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO:19-36. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3.

7. A cell transformed with a recombinant polynucleotide of claim 6. 
8. A transgenic organism comprising a recombinant polynucleotide of claim 6.

9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed. 

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of SEQ ID NO:1-18.

11. An isolated antibody which specifically binds to a polypeptide of claim 1.

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:19-36, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ
ID NO:19-36,

c) a polynucleotide complementary to a polynucleotide of a), 

d) a polynucleotide complementary to a polynucleotide of b), and 

e) an RNA equivalent of a)-d). 

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 12. 

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target
polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.

16. A method of detecting a target polynucleotide in a san^le, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-18.

19. A method for treating a disease or condition associated with decreased expression of
functional CSAP, comprising administering to a patient in need of such treatment the composition of
claim 17.

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

21. A composition comprising an agonist compound identified by a method of claim 20 and a
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of
functional CSAP, comprising administering to a patient in need of such treatment a composition of
claim 21.

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting antagonist activity in the sample.

24. A composition comprising an antagonist compound identified by a method of claim 23 and
a pharmaceutically acceptable excipient.

25. A method for treating a disease or condition associated with overexpression of functional
CSAP, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim 
1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under suitable 
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1.

27. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under conditions
permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test
compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test
compound with the activity of the polypeptide of claim 1 in the absence of the test
compound, wherein a change in the activity of the polypeptide of claim 1 in the
presence of the test compound is indicative of a compound that modulates the activity
of the polypeptide of claim 1.

28. A method of screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) exposing a sample comprising the target polynucleotide to a compound, under
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying 
amounts of the compound and in the absence of the compound. 

29. A method of assessing toxicity of a test compound, the method comprising:

a) treating a biological sample containing nucleic acids with the test compound,

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof,

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with
the amount of hybridization complex in an untreated biological sample, wherein a
difference in the amount of hybridization complex in the treated biological sample is
indicative of toxicity of the test compound.

30. A diagnostic test for a condition or disease associated with the expression of CSAP in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions suitable 
for the antibody to bind the polypeptide and form an antibody:polypeptide complex, 
and 

b) detecting the complex, wherein the presence of the complex correlates with the 
presence of the polypeptide in the biological sample.

31. The antibody of claim 11, wherein the antibody is: 

a) a chimeric antibody, 

b) a single chain antibody,

c) a Fab fragment,

d) a F(ab')2 fragment, or

e) a humanized antibody. 

32. A composition comprising an antibody of claim 11 and an acceptable excipient. 

33. A method of diagnosing a condition or disease associated with the expression of CSAP in 
a subject, comprising administering to said subject an effective amount of the composition of claim 32.

34. A composition of claim 32, wherein the antibody is labeled. 

35. A method of diagnosing a condition or disease associated with the expression of CSAP in 
a subject, comprising administering to said subject an effective amount of the composition of claim 34. 

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence
selected from the group consisting of SEQ ID NO:1-18, or an immunogenic fragment
thereof, under conditions to elicit an antibody response,

b) isolating antibodies from said animal, and

c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal
antibody which binds specifically to a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-18.

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier. 

39. A method of making a monoclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-18, or an immunogenic fragment
thereof, under conditions to elicit an antibody response, 

b) isolating antibody producing cells from the animal,

c) fusing the antibody producing cells with immortalized cells to form monoclonal
antibody-producing hybridoma cells,

d) culturing the hybridoma cells, and

e) isolating from the culture monoclonal antibody which binds specifically to a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-18.

40. A monoclonal antibody produced by a method of claim 39.

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier. 

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab expression
library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant
immunoglobulin library. 

44. A method of detecting a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-18 in a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-18 in the sample.

45. A method of purifying a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-18 from a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-18.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim
13.

47. A method of generating an expression profile of a sample which contains polynucleotides, 
the method comprising:

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides 
of the sample under conditions suitable for the formation of a hybridization complex, 
and 

c) quantifying the expression of the polynucleotides in the sample.

48. An array comprising different nucleotide molecules affixed in distinct physical locations
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide
or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target
polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12.

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to said target polynucleotide. 

52. An array of claim 48, which is a microarray. 

53. An array of claim 48, further comprising said target polynucleotide hybridized to a 
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to
said solid substrate.

55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at
another distinct physical location on the substrate.

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.

69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.

WO 02/053719 PCT/US02/00178

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.

71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16.

72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17.

73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18.

74. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:19.

75. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:20.

76. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:21.

77. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:22.

78. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:23.

79. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:24.

80. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:25.

81. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:26.

82. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:27.

83. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:28.

84. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:29.

85. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:30.

86. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:31.

87. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:32.

88. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:33.

89. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:34.

90. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:35.

91. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:36.
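
As an illustration only, the sequence relationships recited in claims 12 and 13 (a complementary polynucleotide, an RNA equivalent, percent identity, and a run of contiguous nucleotides) can be sketched in a few lines of Python. The function names below are hypothetical and do not come from this application. The percent-identity helper assumes the two sequences have already been aligned to equal length, and it does not reproduce whichever alignment algorithm the specification relies on.

```python
# Hypothetical helpers for illustration; none of these names appear in the application.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Replace thymine with uracil to give the RNA form of the same sequence."""
    return dna.replace("T", "U").replace("t", "u")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching columns in two pre-aligned, equal-length sequences.

    Assumption: '-' marks a gap. A gap against a residue counts as a mismatch,
    and columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def has_contiguous_fragment(candidate: str, reference: str, min_len: int = 60) -> bool:
    """True if candidate contains at least min_len contiguous nucleotides of reference."""
    candidate, reference = candidate.upper(), reference.upper()
    return any(
        reference[i:i + min_len] in candidate
        for i in range(len(reference) - min_len + 1)
    )
```

With these helpers, a test such as `percent_identity(a, b) >= 90.0` corresponds loosely to the "at least 90% identical" language of claim 12 b), and `has_contiguous_fragment(x, ref, 60)` to the "at least 60 contiguous nucleotides" language of claim 13. Both are simplifications under the stated assumptions.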
<150> 60/260,085; 60/268,554; 60/269,111; 60/271,211
<160> 36

<170> PERL Program 

<210> 1
<211> 377
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 5566074CD1

<400> 1

Met Tyr Thr Phe Val Val Arg Asp Glu Asn Ser Ser Val Tyr Ala
1 5 10 15

Glu Val Ser Arg Leu Leu Leu Ala Thr Gly His Trp Lys Arg Leu
20 25 30

Arg Arg Asp Asn Pro Arg Phe Asn Leu Met Leu Gly Glu Arg Asn
35 40 45

Arg Leu Pro Phe Gly Arg Leu Gly His Glu Pro Gly Leu Val Gln
50 55 60

Leu Vfll ARn Tvir Tvr


Arg 


Gly 


Ala Asp Lys 


65 






70 


Ser Leu Val Lvs Leu 


lie 


Lys 


Thr Ser Pro 


80 






85 


Cva Tliir Tnri Plie Pro 


Glu 


Ser 


Tyr Val lie 


95 






100 


TmVa Tlir* Pyn Val Ala 

AJjfO XllXi XrXU VOX 


Pro 


Ala 


Gin Asn Glv 


110 






115 


J?e"i^ Aein QoT* AT*rr 1*hi* 


Asp 


Glu 


Arn Glu Phe 


125 






130 


Aqti Avrt T.\7ia f5l ii 


Asp 


Glv 


Glu Glv Asn 

wjbw 


1 40 
x% V 






145 


oox OCX Axa uxy^ axo. 


Lys 


Ol^7 

wXjf 


Glu Glv Tie 

wXU wXJf XXC5 


155 






160 


A1a Qo>* Oil 11 T.oii T.011 
aXcI 0C3X VCrXU Jjeu. JueU 


Asp 




Tl e AnT% Astn 

XXC3 AOli 


1 70 
X / y 






175 
X / ^ 


vax Xx9 vjrxn ijjfS x^x 


Xjeu 


nl 11 

uxu 


TTi » Pr>o Leu 

£1XB d.\J XlBU 


X O .J 






190 


XIX o Axy Juys Jriie •^bJt 


xxo 


Arg 


Ocsx xxj^ vax 


200 






205 


AJfX nail XXCS XjfX USU 




Arg 


Glu Glv Val 

wXU wXjr VOX 


^X J 






220 


QTvi Pt-o T\rr Hie Val 
wxu f^xv/ XjfJm nxo vcix 


Ann 


Asn 


Phe nl n A^n 


230 






235 


xnx nsAi nXB XXC3 


Gin 


ijys 


UXU XjrX iSOX 


245 






250 


OXU vjrXU noil 


rjl 11 

oxu 


Mot* 


"DVto "PVio Ta/cs 
IrJEl.CS Xrllc Xijf o 


260 






265 


jjbu ihx oex axcx ijvsu 


Asn 


Tl e 


XXIX XJBU V9XU 


975 






280 


uxn xxe Juys nxs xxe 


Tl A 
XX6 


Arg 


Ann Ovra TiAii 
ASn Vjro XiCU 


^ 7 V 






295 


Ala Tl A TViv T.v/a 
AXa XXC3 OCX XXlx Jjj^st 


IlXo 


Leu 


IrXO Xjfx wXll 








JXU 


Gly Phe Asp Phe Met 


vax 


Asp 


11 ^1 1 11 T All 

oJ.U liXU liSU 


320 






325 


Glu Val Asn Gly Ala 


Pro 


Ala 


Cys Ala Qln 


335 






340 


Leu Cys Gin Gly lie 


Val 


Asp 


He Ala He 


350 






355 


Pro Pro Asp Val Glu 


Gin 


Pro 


Gin Thr Gin 


365 






370 



Lys Leu 



Leu Cys Arg Lys Ala
75

Glu Leu Ala Glu Ser
90

Tyr Pro Thr Asn Leu
105

Ile Gln Pro Pro Ile
120

Phe Leu Ala Ser Tyr
135

Val Trp Ile Ala Lys
150

Leu Ile Ser Ser Glu
165

Gln Gly Gln Val His
180

Leu Leu Glu Pro Gly
195

Leu Val Asp His Gln
210

Leu Arg Thr Ala Ser
225

Lys Thr Cys His Leu
240

Lys Asn Tyr Gly Lys
255

Glu Phe Asn Gln Tyr
270

Ser Ser Ile Leu Leu
285

Leu Ser Val Glu Pro
300

Ser Phe Gln Leu Phe
315

Lys Val Trp Leu Ile
330

Lys Leu Tyr Ala Glu
345

Ser Ser Val Phe Pro
360

Pro Ala Ala Phe Ile
375

<210> 2
<211> 696
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 5679814CD1

<400> 2

Met Lys Trp Leu Ile Asp Pro Leu Pro Val Asn Val Arg Val Ile
1 5 10 15

Val Ser Val Asn Val Glu Thr Cys Pro Pro Ala Trp Arg Leu Trp
20 25 30

Pro Thr Leu His Leu Asp Pro Leu Ser Pro Lys Asp Ala Lys Ser
35 40 45

Ile Ile Ile Ala Glu Cys His Ser Val Asp Ile Lys Leu Ser Lys
50 55 60

Glu Gln Glu Lys Lys Leu Glu Arg His Cys Arg Ser Ala Thr Thr
65 70 75

Cys Asn Ala Leu Tyr Val Thr Leu Phe Gly Lys Met Ile Ala Arg
80 85 90

Ala Gly Arg Ala Gly Asn Leu Asp Lys Ile Leu His Gln Cys Phe
95 100 105

Gln Cys Gln Asp Thr Leu Ser Leu Tyr Arg Leu Val Leu His Ser
110 115 120

Ile Arg Glu Ser Met Ala Asn Asp Val Asp Lys Glu Leu Met Lys
125 130 135

Gln Ile Leu Cys Leu Val Asn Val Ser His Asn Gly Val Ser Glu
140 145 150

Ser Glu Leu Met Glu Leu Tyr Pro Glu Met Ser Trp Thr Phe Leu
155 160 165

Thr Ser Leu Ile His Ser Leu Tyr Lys Met Cys Leu Leu Thr Tyr
170 175 180

Gly Cys Gly Leu Leu Arg Phe Gln His Leu Gln Ala Trp Glu Thr
185 190 195

Val Arg Leu Glu Tyr Leu Glu Gly Pro Thr Val Thr Ser Ser Tyr
200 205 210

Arg Gln Lys Leu Ile Asn Tyr Phe Thr Leu Gln Leu Ser Gln Asp
215 220 225

Arg Val Thr Trp Arg Ser Ala Asp Glu Leu Pro Trp Leu Phe Gln
230 235 240

Gln Gln Gly Ser Lys Gln Lys Leu His Asp Cys Leu Leu Asn Leu
245 250 255

Phe Val Ser Gln Asn Leu Tyr Lys Arg Gly His Phe Ala Glu Leu
260 265 270

Leu Ser Tyr Trp Gln Phe Val Gly Lys Asp Lys Ser Ala Met Ala
275 280 285

Thr Glu Tyr Phe Asp Ser Leu Lys Gln Tyr Glu Lys Asn Cys Glu
290 295 300

Gly Glu Asp Asn Met Ser Cys Leu Ala Asp Leu Tyr Glu Thr Leu
305 310 315

Gly Arg Phe Leu Lys Asp Leu Gly Leu Leu Ser Gln Ala Ile Val
320 325 330

Pro Leu Gln Arg Ser Leu Glu Ile Arg Glu Thr Ala Leu Asp Pro
335 340 345

Asp His Pro Arg Val Ala Gln Ser Leu His Gln Leu Ala Ser Val
350 355 360

Tyr Val Gln Trp Lys Lys Phe Gly Asn Ala Glu Gln Leu Tyr Lys
365 370 375

Gln Ala Leu Glu Ile Ser Glu Asn Ala Tyr Gly Ala Asp His Pro
380 385 390

Tyr Thr Ala Arg Glu Leu Glu Ala Leu Ala Thr Leu Tyr Gln Lys
395 400 405

Gln Asn Lys Tyr Glu Gln Ala Glu His Phe Arg Lys Lys Ser Phe
410 415 420

Lys Ile His Gln Lys Ala Ile Lys Lys Lys Gly Asn Leu Tyr Gly
425 430 435

Phe Ala Leu Leu Arg Arg Arg Ala Leu Gln Leu Glu Glu Leu Thr
440 445 450

Leu Gly Lys Asp Thr Pro Asp Asn Ala Arg Thr Leu Asn Glu Leu
455 460 465

Gly Val Leu Tyr Tyr Leu Gln Asn Asn Leu Glu Thr Ala Asp Gln
470 475 480

Phe Leu Lys Arg Ser Leu Glu Met Arg Glu Arg Val Leu Gly Pro
485 490 495

Asp His Pro Asp Cys Ala Gln Ser Leu Asn Asn Leu Ala Ala Leu
500 505 510

Cys Asn Glu Lys Lys Gln Tyr Asp Lys Ala Glu Glu Leu Tyr Glu
515 520 525

Arg Ala Leu Asp Ile Arg Arg Arg Ala Leu Ala Pro Asp His Pro
530 535 540

Ser Leu Ala Tyr Thr Val Lys His Leu Ala Ile Leu Tyr Lys Lys
545 550 555

Met Gly Lys Leu Asp Lys Ala Val Pro Leu Tyr Glu Leu Ala Val
560 565 570

Glu Ile Arg Gln Lys Ser Phe Gly Pro Lys His Pro Ser Val Ala
575 580 585

Thr Ala Leu Val Asn Leu Ala Val Leu Tyr Ser Gln Met Lys Lys
590 595 600

His Val Glu Ala Leu Pro Leu Tyr Glu Arg Ala Leu Lys Ile Tyr
605 610 615

Glu Asp Ser Leu Gly Arg Met His Pro Arg Val Gly Glu Thr Leu
620 625 630

Lys Asn Leu Ala Val Leu Ser Tyr Glu Gly Gly Asp Phe Glu Lys
635 640 645

Ala Ala Glu Leu Tyr Lys Arg Ala Met Glu Ile Lys Glu Ala Glu
650 655 660

Thr Ser Leu Leu Gly Gly Lys Ala Pro Ser Arg His Ser Ser Ser
665 670 675

Gly Asp Thr Phe Ser Leu Lys Thr Ala His Ser Pro Asn Val Phe
680 685 690

Leu Gln Gln Gly Gln Arg
695

<210> 3 

<211> 1050 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7472735CD1 

<400> 3 

Met Ala Leu Tyr Asp Glu Asp Leu Leu Lys Asn Pro Phe Tyr Leu
1 5 10 15

Ala Leu Gln Lys Cys Arg Pro Asp Leu Cys Ser Lys Val Ala Gln
20 25 30

Ile His Gly Ile Val Leu Val Pro Cys Lys Gly Ser Leu Ser Ser
35 40 45

Ser Ile Gln Ser Thr Cys Gln Phe Glu Ser Tyr Ile Leu Ile Pro
50 55 60

Val Glu Glu His Phe Gln Thr Leu Asn Gly Lys Asp Val Phe Ile
65 70 75

Gln Gly Asn Arg Ile Lys Leu Gly Ala Gly Phe Ala Cys Leu Leu
80 85 90

Ser Val Pro Ile Leu Phe Glu Glu Thr Phe Tyr Asn Glu Lys Glu
95 100 105

Glu Ser Phe Ser Ile Leu Cys Ile Ala His Pro Leu Glu Lys Arg
110 115 120

Glu Ser Ser Glu Glu Pro Leu Ala Pro Ser Asp Pro Phe Ser Leu
125 130 135

Lys Thr Ile Glu Asp Val Arg Glu Phe Leu Gly Arg His Ser Glu
140 145 150

Arg Phe Asp Arg Asn Ile Ala Ser Phe His Arg Thr Phe Arg Glu
155 160 165

Cys Glu Arg Lys Ser Leu Arg His His Ile Asp Ser Ala Asn Ala
170 175 180

Leu Tyr Thr Lys Cys Leu Gln Gln Leu Leu Arg Asp Ser His Leu
185 190 195

Lys Met Leu Ala Lys Gln Glu Ala Gln Met Asn Leu Met Lys Gln
200 205 210

Ala Val Glu Ile Tyr Val His His Glu Ile Tyr Asn Leu Ile Phe
215 220 225

Lys Tyr Val Gly Thr Met Glu Ala Ser Glu Asp Ala Ala Phe Asn
230 235 240

Lys Ile Thr Arg Ser Leu Gln Asp Leu Gln Gln Lys Asp Ile Gly
245 250 255

Val Lys Pro Glu Phe Ser Phe Asn Ile Pro Arg Ala Lys Arg Glu
260 265 270

Leu Ala Gln Leu Asn Lys Cys Thr Ser Pro Gln Gln Lys Leu Val
275 280 285

Cys Leu Arg Lys Val Val Gln Leu Ile Thr Gln Ser Pro Ser Gln
290 295 300

Arg Val Asn Leu Glu Thr Met Cys Ala Asp Asp Leu Leu Ser Val
305 310 315

Leu Leu Tyr Leu Leu Val Lys Thr Glu Ile Pro Asn Trp Met Ala
320 325 330

Asn Leu Ser Tyr Ile Lys Asn Phe Arg Phe Ser Ser Leu Ala Lys
335 340 345

Asp Glu Leu Gly Tyr Cys Leu Thr Ser Phe Glu Ala Ala Ile Glu
350 355 360

Tyr Ile Arg Gln Gly Ser Leu Ser Ala Lys Pro Pro Glu Ser Glu
365 370 375

Gly Phe Gly Asp Arg Leu Phe Leu Lys Gln Arg Met Ser Leu Leu
380 385 390

Ser Gln Met Thr Ser Ser Pro Thr Asp Cys Leu Phe Lys His Ile
395 400 405

Ala Ser Gly Asn Gln Lys Glu Val Glu Arg Leu Leu Ser Gln Glu
410 415 420

Asp His Asp Lys Asp Thr Val Gln Lys Met Cys His Pro Leu Cys
425 430 435

Phe Cys Asp Asp Cys Glu Lys Leu Val Ser Gly Arg Leu Asn Asp
440 445 450

Pro Ser Val Val Thr Pro Phe Ser Arg Asp Asp Arg Gly His Thr
455 460 465

Pro Leu His Val Ala Ala Val Cys Gly Gln Ala Ser Leu Ile Asp
470 475 480

Leu Leu Val Ser Lys Gly Ala Met Val Asn Ala Thr Asp Tyr His
485 490 495

Gly Ala Thr Pro Leu His Leu Ala Cys Gln Lys Gly Tyr Gln Ser
500 505 510

Val Thr Leu Leu Leu Leu His Tyr Lys Ala Ser Ala Glu Val Gln
515 520 525

Asp Asn Asn Gly Asn Thr Pro Leu His Leu Ala Cys Thr Tyr Gly
530 535 540

His Glu Asp Cys Val Lys Ala Leu Val Tyr Tyr Asp Val Glu Ser
545 550 555

Cys Arg Leu Asp Ile Gly Asn Glu Lys Gly Asp Thr Pro Leu His
560 565 570

Ile Ala Ala Arg Trp Gly Tyr Gln Gly Val Ile Glu Thr Leu Leu
575 580 585

Gln Asn Gly Ala Ser Thr Glu Ile Gln Asn Arg Leu Lys Glu Thr
590 595 600

Pro Leu Lys Cys Ala Leu Asn Ser Lys Ile Leu Ser Val Met Glu
605 610 615

Ala Tyr His Leu Ser Phe Glu Arg Arg Gln Lys Ser Ser Glu Ala
620 625 630

Pro Val Gln Ser Pro Gln Arg Ser Val Asp Ser Ile Ser Gln Glu
635 640 645

Ser Ser Thr Ser Ser Phe Ser Ser Met Ser Ala Ser Ser Arg Gln
650 655 660

Glu Glu Thr Lys Lys Asp Tyr Arg Glu Val Glu Lys Leu Leu Arg
665 670 675

Ala Val Ala Asp Gly Asp Leu Glu Met Val Arg Tyr Leu Leu Glu
680 685 690

Trp Thr Glu Glu Asp Leu Glu Asp Ala Glu Asp Thr Val Ser Ala
695 700 705

Ala Asp Pro Glu Phe Cys His Pro Leu Cys Gln Cys Pro Lys Cys
710 715 720

Ala Pro Ala Gln Lys Arg Leu Ala Lys Val Pro Ala Ser Gly Leu
725 730 735

Gly Val Asn Val Thr Ser Gln Asp Gly Ser Ser Pro Leu His Val
740 745 750

Ala Ala Leu His Gly Arg Ala Asp Leu Ile Pro Leu Leu Leu Lys
755 760 765

His Gly Ala Asn Ala Gly Ala Arg Asn Ala Asp Gln Ala Val Pro
770 775 780

Leu His Leu Ala Cys Gln Gln Gly His Phe Gln Val Val Lys Cys
785 790 795

Leu Leu Asp Ser Asn Ala Lys Pro Asn Lys Lys Asp Leu Ser Gly
800 805 810

Asn Thr Pro Leu Ile Tyr Ala Cys Ser Gly Gly His His Glu Leu
815 820 825

Val Ala Leu Leu Leu Gln His Gly Ala Ser Ile Asn Ala Ser Asn
830 835 840

Asn Lys Gly Asn Thr Ala Leu His Glu Ala Val Ile Glu Lys His
845 850 855

Val Phe Val Val Glu Leu Leu Leu Leu His Gly Ala Ser Val Gln
860 865 870

Val Leu Asn Lys Arg Gln Arg Thr Ala Val Asp Cys Ala Glu Gln
875 880 885

Asn Ser Lys Ile Met Glu Leu Leu Gln Val Val Pro Ser Cys Val
890 895 900

Ala Ser Leu Asp Asp Val Ala Glu Thr Asp Arg Lys Glu Tyr Val
905 910 915

Thr Val Lys Ile Arg Lys Lys Trp Asn Ser Lys Leu Tyr Asp Leu
920 925 930

Pro Asp Glu Pro Phe Thr Arg Gln Phe Tyr Phe Val His Ser Ala
935 940 945

Gly Gln Phe Lys Gly Lys Thr Ser Arg Glu Ile Met Ala Arg Asp
950 955 960

Arg Ser Val Pro Asn Leu Thr Glu Gly Ser Leu His Glu Pro Gly
965 970 975

Arg Gln Ser Val Thr Leu Arg Gln Asn Asn Leu Pro Ala Gln Ser
980 985 990

Gly Ser His Ala Ala Glu Lys Gly Asn Ser Asp Trp Pro Glu Arg
995 1000 1005

Pro Gly Leu Thr Gln Thr Gly Pro Gly His Arg Arg Met Leu Arg
1010 1015 1020

Arg His Thr Val Glu Asp Ala Val Val Ser Gln Gly Pro Glu Ala
1025 1030 1035

Ala Gly Pro Leu Ser Thr Pro Gln Glu Val Ser Ala Ser Arg Ser
1040 1045 1050



<210> 4 
<211> 326 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7131221CD1 

<400> 4 

Met Asn Phe Thr Val Gly Phe Lys Pro Leu Leu Gly Asp Ala His
1 5 10 15

Ser Met Asp Asn Leu Glu Lys Gln Leu Ile Cys Pro Ile Cys Leu
20 25 30

Glu Met Phe Ser Lys Pro Val Val Ile Leu Pro Cys Gln His Asn
35 40 45

Leu Cys Arg Lys Cys Ala Asn Asp Val Phe Gln Ala Ser Asn Pro
50 55 60

Leu Trp Gln Ser Arg Gly Ser Thr Thr Val Ser Ser Gly Gly Arg
65 70 75

Phe Arg Cys Pro Ser Cys Arg His Glu Val Val Leu Asp Arg His
80 85 90

Gly Val Tyr Gly Leu Gln Arg Asn Leu Leu Val Glu Asn Ile Ile
95 100 105

Asp Ile Tyr Lys Gln Glu Ser Ser Arg Pro Leu His Ser Lys Ala
110 115 120

Glu Gln His Leu Met Cys Glu Glu His Glu Glu Glu Lys Ile Asn
125 130 135

Ile Tyr Cys Leu Ser Cys Glu Val Pro Thr Cys Ser Leu Cys Lys
140 145 150

Val Phe Gly Ala His Lys Asp Cys Glu Val Ala Pro Leu Pro Thr
155 160 165

Ile Tyr Lys Arg Gln Lys Asp Asn Ser Arg Arg Gln Lys Gln Leu
170 175 180

Leu Asn Gln Arg Phe Glu Ser Leu Cys Ala Val Leu Glu Glu Arg
185 190 195

Lys Gly Glu Leu Leu Gln Ala Leu Ala Arg Glu Gln Glu Glu Lys
200 205 210

Leu Gln Arg Val Arg Gly Leu Ile Arg Gln Tyr Gly Asp His Leu
215 220 225

Glu Ala Ser Ser Lys Leu Val Glu Ser Ala Ile Gln Ser Met Glu
230 235 240

Glu Pro Gln Met Ala Leu Tyr Leu Gln Gln Ala Lys Glu Leu Ile
245 250 255

Asn Lys Val Gly Ala Met Ser Lys Val Glu Leu Ala Gly Arg Pro
260 265 270

Glu Pro Gly Tyr Glu Ser Met Glu Gln Phe Thr Val Arg Val Glu
275 280 285

His Val Ala Glu Met Leu Arg Thr Ile Asp Phe Gln Pro Gly Ala
290 295 300

Ser Gly Glu Glu Glu Glu Val Ala Pro Asp Gly Glu Glu Gly Ser
305 310 315

Ala Gly Pro Glu Glu Glu Arg Pro Asp Gly Pro
320 325

<210> 5 

<211> 505 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7480551CD1 

<400> 5 

Met Leu Ser Phe Phe Arg Arg Thr Leu Gly Arg Arg Ser Met Arg
1 5 10 15

Lys His Ala Glu Lys Glu Arg Leu Arg Glu Ala Gln Arg Ala Ala
20 25 30

Thr His Ile Pro Ala Ala Gly Asp Ser Lys Ser Ile Ile Thr Cys
35 40 45

Arg Val Ser Leu Leu Asp Gly Thr Asp Val Ser Val Asp Leu Pro
50 55 60

Lys Lys Ala Lys Gly Gln Glu Leu Phe Asp Gln Ile Met Tyr His
65 70 75

Leu Asp Leu Ile Glu Ser Asp Tyr Phe Gly Leu Arg Phe Met Asp
80 85 90

Ser Ala Gln Val Ala His Trp Leu Asp Gly Thr Lys Ser Ile Lys
95 100 105

Lys Gln Val Lys Ile Gly Ser Pro Tyr Cys Leu His Leu Arg Val
110 115 120

Lys Phe Tyr Ser Ser Glu Pro Asn Asn Leu Arg Glu Glu Leu Thr
125 130 135

Arg Tyr Leu Phe Val Leu Gln Leu Lys Gln Asp Ile Leu Ser Gly
140 145 150

Lys Leu Asp Cys Pro Phe Asp Thr Ala Val Gln Leu Ala Ala Tyr
155 160 165

Asn Leu Gln Ala Glu Leu Gly Asp Tyr Asp Leu Ala Glu His Ser
170 175 180

Pro Glu Leu Val Ser Glu Phe Arg Phe Val Pro Ile Gln Thr Glu
185 190


Olu 


Met 


Glu 


Leu Ala 


He 


Phe Qlu Lys Trp 








200 




205 


oln 


Thr 


Pro 


Ala Gin 


Ala 


Glu Thr Asn Tyr 








215 




220 


Trp 


Leu 


Glu 


Met Tyr 


Glv 


Val Asp Met His 








230 




235 


Asp 


Qly 


Asn 


Asp TyT 


Ser 


Leu Gly Leu Thr 








245 




250 


Val 


Phe 


Glu 


Qly Asp 


Thr 


Lys Xle Gly Leu 








260 




265 


He 


"Thr 


Ara 


Leu Asp 


Phe 


Lys Lys Asn Lys 








275 




280 


Val 


Glu 


Asp 




Gin 


Glv TjVS Qlu Qlu 








290 




295 


Phe 


Airgf 


Leu 


AsD His 


Pro 


Lvs Ala Cvs Lvs 








305 




310 


Ala 


Val 


Glu 


His His 


Ala 


Phe Phe Attq Tieu 








320 




325 




Sear 


Ser 


His Ara 


Ser 


Glv Phe lie Arra 








335 




340 


Ara 


* Jr ^ 


Ser 


Glv Lvs 


Thr 


Glu Tyr Gin Thr 








350 




355 


Ala 


Ajrg" 


Arcf 


Ser Thr 


Ser 


Phe Glu Ar^ Ar0 








365 




370 






Arcr 


Thr Leu 


Gin 


Met Lys Ala Cys 








380 




385 


Qlu 


Leu 


Ser 


Val His 


Asn 


Asn Val Ser Thr 








395 




400 


Gin 


Gin 


Ala 


Trp Gly 


Met 


Atyt R^t* at a Ti^ti 

m\^\^ XliLGv 








410 




415 


lie 


Sex* 


Ser 


Ala Pro 


Val 


Pro Val Glu He 








425 




430 


SesT 


Pro 


Glv 


THt* Ann 


Gin 


Hi R Attt T.vq 

OXD LJjftS 








440 




445 




Asp 


cys 






uxy u±y Asn v7xn 








455 




• 460 


Leu 


Pro 


Pro 


Pro Gin 


Thr 


Ala His Arg Asn 








470 




475 


His 


Glu 


His 


Asn Val 


Lys 


Asn Ala Gly He 








485 




490 


Phe 


Pro 


Gly 


His Thr 


Ala 


Met Thr Glu He 








500 




505 



<210> 6 

<211> 367 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> inigc_feature 

<223> Incyte ID No: 3315870CD1 

<400> 6 

Met Ala Val Leu Lys Leu Thr Asp Gln Pro
1 5 10



PCT/US02/00178



195 

Lys Glu Tyr Arg Gly 
210 

Leu Asn Lys Ala Lys 

225 

Val Val Lys Ala Arg 
240 

Pro Thr Gly Val Leu 
255 

Phe Phe Trp Pro Lys 
270 

Leu Thr Leu Val Val 
285 

Glu His Thr Phe Val 

300 

His Leu Trp Lys Cys
315 

Arg Gly Pro Val Gln
330 

Leu Gly Ser Arg Phe 
345 

Thr Lys Thr Asn Lys 

360 

Pro Ser Lys Arg Tyr 
375 

Ala Thr Lys Pro Glu 
390 

Gln Ser Asn Gly Ser
405 

Pro Val Ser Pro Ser 

420 

Glu Asn Leu Pro Gln
435 

Trp Leu Ser Ala Ala 
450 

Trp Asn Thr Arg Ala 
465 

Tyr Thr Asp Phe Val 
480 

Arg His Asp Val His 
495 



Pro Leu Val Gln Ala
15

Ile Phe Ser Gly Asp Pro Glu Glu Ile Arg Met Leu Ile His Lys
20 25 30

Thr Glu Asp Val Asn Thr Leu Asp Ser Glu Lys Arg Thr Pro Leu
35 40 45

His Val Ala Ala Phe Leu Gly Asp Ala Glu Ile Ile Glu Leu Leu
50 55 60

Ile Leu Ser Gly Ala Arg Val Asn Ala Lys Asp Asn Met Trp Leu
65 70 75


Tlur Paro 


Leu His 


Arg Ala Val Ala 


Ser Arg 


Ser Glu Qlu Ala Val 






80 


85 


90 


Gin Val 


Leu Xle 


Lvs His Seir Ala 


AsD Val 


Asn Ala Ara Asd Lvs 






95 


100 


105 


Asn Tjcp 


Gin Hkir 


Pro Leu His Val 

*r JJwU JtAAl9 VGItali 


Ala Ala 


Ala Asn Lvs Ala Val 






110 


115 


120 


Lys Cys 


Ala Glu 


Val Tie He Pro 




Ser Ser Val Asn Val 

1^ h^^^ V ^A«U fl0AA vQJb 






125 


130 


135 




Argf Gly 


Glv Am Tlir Ala 


Lf>ii s 


His Ala Ala Leu Asn 

J^AW* AJCU n0Al 






140 


.145 


150 


Qly His 


Val Qlu 


Val Asn Ti^u 


Leu Leu 


Ala Lvs Qlv Ala Asn 






155 


160 


165 


He Asn 


Ala Phe 


Asn Lvs Lvs Asd 


Arcr Arcr 


Ala Leu His Trp Ala 






170 


175 


180 


Ala 'ICyjc 


Met Qlv 

flow ySi^jf 


His Leu Asn Val 

«i.XO UW <^OJ^ VCAA 


Val Ala 


Leu Leu Tie Asn His 

^^^9 






185 


190 


195 


Qly Ala 


Glu Val 


Thr Cvs Lvs Asi9 


Lys Lys 


Qlv Tvr Thr Pro Leu 






200 


205 


210 


His Ala 


Ala Ala 


R^Y" A<3n mv n1 n 

n^Oll wJUj^ wXlA 


T 1 ^ Acin 


Val Val Lvs His Leu 






215 


220 


225 




T,ou Qlv 


Val Glu Tie Asi3 


Glu He 


Asn Val Tvr Qlv Asn 

^^0XX V GIX A Jr «b K^Xjr «^04A 






230 


235 


240 


Thr Ala 


Leu His 


Tl Ala Cvs TS^r 


Asn Qlv 


Qln Aso Ala Val Val 






245 


250 


255 


Asm u 


Leu Tie 


Ann TH/T^ Qlv Ala 
nci^ xjf^ wxj^ nxcn 


Asn Val 


Asn Gl n Pro Asn Asn 






260 


265 


270 


Asn QTv 


Phe Thr 


A Xw AlXO iTlAC 


Ala Ala 


Ala Ser Thr His Qlv 






275 


280 


285 


Ala T.^aij 




Ol u T^^u Val 

\SJi,\Jk V CiX 


Asn Asn 


Qlv ATa Asn Val Asn 

vxj^ fXXM V MX ns7xx 






290 


295 


300 


He Gin 


Ser Lys 


Asn Glv t.vs S^r 


Pro Leu 


His Met Thr Ala Val 






305 


310 


315 


His Gly Arg Phe Thr Arg Ser Gln Thr Leu Ile Gln Asn Gly Gly
320 325 330

Glu Ile Asp Cys Val Asp Lys Asp Gly Asn Thr Pro Leu His Val
335 340 345

Ala Ala Arg Tyr Gly His Glu Leu Leu Ile Asn Thr Leu Ile Thr
350 355 360

Ser Gly Ala Asp Thr Ala Lys
365

<210> 7 

<211> 435 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7484690CD1

<400> 7

Met Arg Glu Ile Val Leu Thr Gln Thr Gly Gln Cys Gly Asn Gln
1 5 10 15

Ile Gly Ala Lys Gln Phe Trp Glu Val Ile Ser Asp Glu His Ala
20 25 30

Ile Asp Ser Ala Gly Thr Tyr His Gly Asp Ser His Leu Pro Leu
35 40 45

Glu Arg Val Asn Val His His His Glu Ala Ser Gly Gly Arg Tyr
50 55 60

Val Pro Arg Ala Val Leu Val Asp Leu Glu Pro Gly Thr Met Asp
65 70 75

Ser Val Arg Ser Gly Pro Phe Gly Gln Val Phe Arg Pro Asp Asn
80 85 90

Phe Ile Ser Arg Gln Cys Gly Ala Gly Asn Asn Trp Ala Lys Gly
95 100 105

Arg Tyr Thr Glu Gly Ala Glu Leu Thr Glu Ser Val Met Asp Val
110 115 120

Val Arg Lys Glu Ala Glu Ser Cys Asp Cys Leu Gln Gly Phe Gln
125 130 135

Leu Thr His Ser Leu Gly Gly Gly Thr Gly Ser Gly Met Gly Thr
140 145 150

Leu Leu Leu Ser Lys Ile Arg Glu Glu Tyr Pro Asp Arg Ile Ile
155 160 165

Asn Thr Phe Ser Ile Leu Pro Ser Pro Lys Val Ser Asp Thr Val
170 175 180

Val Glu Pro Tyr Asn Val Thr Leu Ser Val His Gln Leu Ile Glu
185 190 195

Asn Ala Asp Glu Thr Phe Cys Ile Asp Asn Glu Ala Leu Tyr Asp
200 205 210

Ile Cys Ser Arg Thr Leu Lys Leu Pro Thr Pro Thr Tyr Gly Asp
215 220 225

Leu Asn His Leu Val Ser Ala Thr Met Ser Gly Val Thr Thr Cys
230 235 240

Leu Arg Phe Pro Gly Gln Leu Asn Ala Asp Leu Arg Lys Leu Ala
245 250 255

Val Asn Met Val Pro Phe Pro Arg Leu His Phe Phe Met Pro Gly
260 265 270

Phe Ala Pro Leu Thr Ser Arg Gly Ser Gln Gln Tyr Arg Ala Leu
275 280 285

Thr Val Ala Glu Leu Thr Gln Gln Met Phe Asp Ala Lys Asn Met
290 295 300

Met Ala Ala Arg Asp Pro Cys His Gly Arg Tyr Leu Thr Val Ala
305 310 315

Ala Ile Phe Arg Gly Arg Met Pro Met Arg Glu Val Asp Glu Gln
320 325 330

Met Phe Asn Ile Gln Asp Lys Asn Ser Ser Tyr Phe Ala Asp Trp
335 340 345

Phe Pro Asp Asn Val Lys Thr Ala Val Cys Asp Ile Pro Pro Arg
350 355 360

Gly Leu Lys Met Ser Ala Thr Phe Ile Gly Asn Asn Thr Ala Val
365 370 375

Gln Glu Leu Lys Arg Val Ser Glu Gln Phe Thr Ala Thr Phe Arg
380 385 390

Arg Lys Ala Phe Leu His Trp Tyr Thr Gly Glu Gly Met Asp Glu
395 400 405

Met Glu Phe Thr Glu Ala Glu Ser Asn Met Asn Asp Leu Val Ser
410 415 420

Glu Tyr Gln Gln Tyr Gln Asp Ala Thr Ala Glu Gly Gly Gly Val
425 430 435

<210> 8

<211> 198

<212> PRT

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7612559CD1

<400> 8

Met Gly Gly Arg Lys Arg Glu Arg Lys Ala Ala Val Glu Glu Asp
1 5 10 15

Thr Ser Leu Ser Glu Ser Glu Gly Pro Arg Gln Pro Asp Gly Asp
20 25 30

Glu Glu Glu Ser Thr Ala Leu Ser Ile Asn Glu Glu Met Gln Arg
35 40 45

Met Leu Asn Gln Leu Arg Glu Tyr Asp Phe Glu Asp Asp Cys Asp
50 55 60

Ser Leu Thr Trp Glu Glu Thr Glu Glu Thr Leu Leu Leu Trp Glu
65 70 75

Asp Phe Ser Gly Tyr Ala Met Ala Ala Ala Glu Ala Gln Gly Glu
80 85 90

Gln Gln Glu Asp Ser Leu Glu Lys Val Ile Lys Asp Thr Glu Ser
95 100 105

Leu Phe Lys Thr Arg Glu Lys Glu Tyr Gln Glu Thr Ile Asp Gln
110 115 120

Ile Glu Leu Glu Leu Ala Thr Ala Lys Asn Asp Met Asn Arg His
125 130 135

Leu His Glu Tyr Met Glu Met Cys Ser Met Lys Arg Gly Leu Asp
140 145 150

Val Gln Met Glu Thr Cys Arg Arg Leu Ile Thr Gln Ser Gly Asp
155 160 165

Arg Lys Ser Pro Ala Phe Thr Ala Val Pro Leu Ser Asp Arg Arg
170 175 180

Arg Arg Gln Ala Arg Leu Arg Thr Pro Ile Ala Met Ser His Leu
185 190 195

Thr Ala Pro

<210> 9 

<211> 139 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4940751CD1

<400> 9

Met Ala Asn Ala Arg Ser Gly Val Ala Val Asn Asp Glu Cys Met
1 5 10 15

Leu Lys Phe Gly Glu Leu Gln Ser Lys Arg Leu His Arg Phe Leu
20 25 30

Thr Phe Lys Met Asp Asp Lys Phe Lys Glu Ile Val Val Asp Gln
35 40 45

Val Gly Asp Arg Ala Hir Ser Tyr Glu Asp Phe Thr Asn Ser Leu
50 55 60

Pro Glu Asn Asp Cys Arg Tyr Ala Ile Tyr Asp Phe Asp Phe Val
65 70 75

Thr Ala Glu Asp Val Gln Lys Ser Arg Ile Phe Tyr Ile Leu Trp
80 85 90

Ser Pro Ser Ser Ala Lys Val Lys Ser Lys Met Leu Tyr Ala Ser
95 100 105

Ser Asn Gln Lys Phe Lys Ser Gly Leu Asn Gly Ile Gln Val Glu
110 115 120

Leu Gln Ala Thr Asp Ala Ser Glu Ile Ser Leu Asp Glu Ile Lys
125 130 135

Asp Arg Ala Arg



<210> 10 

<211> 736 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7946761CD1

<400> 10

Met Thr Trp Gly Thr Pro Asp Phe Leu Asn Arg Ser Ser Thr His
1 5 10 15

Ser Ser Arg Val Pro Ser Arg Phe Pro Phe Leu Asn Glu Ile Val
20 25 30

Ala His Pro Val Ala Ser Ser His Pro Gly Ser Tyr Arg Arg Ser
35 40 45

Gln Thr Leu Leu Glu Arg Leu Arg Val Ser Arg Ala Pro Glu Asp
50 55 60

Thr Lys Ala Leu Glu Pro Arg Cys Gly Pro Pro Cys Gly Ala Gly
65 70 75

Gln Pro Gly Trp Glu Pro Cys Ser Ala Leu Glu Arg Gly Pro Pro
80 85 90

Ser Arg Gly Glu Glu Arg Arg Met Pro Thr Ser Pro Pro Ala Gly
95 100 105

Ser Arg Lys Ser Thr Asp Gln Ala Val Arg Phe Gly Pro Ser Gln
110 115 120

Gly Met Cys Ser Glu Ala Arg Leu Ala Arg Arg Leu Arg Asp Ala
125 130 135

Leu Arg Glu Glu Glu Pro Trp Ala Val Glu Glu Leu Leu Arg Cys
140 145 150

Gly Ala Asp Pro Asn Leu Val Leu Glu Asp Gly Ala Ala Ala Val
155 160 165

His Leu Ala Ala Gly Ala Arg His Pro Arg Gly Leu Arg Cys Leu
170 175 180

Gly Ala Leu Leu Arg Gln Gly Gly Asp Pro Asn Ala Arg Ser Val
185 190 195

Glu Ala Leu Thr Pro Leu His Val Ala Ala Ala Trp Gly Cys Arg
200 205 210

Arg Gly Leu Glu Leu Leu Leu Ser Gln Gly Ala Asp Pro Ala Leu
215 220 225

Arg Asp Gln Asp Gly Leu Arg Pro Leu Asp Leu Ala Leu Gln Gln
230 235 240

Gly His Leu Glu Cys Ala Arg Val Leu Gln Asp Leu Asp Thr Arg
245 250 255

Thr Arg Thr Arg Thr Arg Ile Gly Ala Glu Thr Gln Glu Pro Glu
260 265 270

Pro Ala Pro Gly Thr Pro Gly Leu Ser Gly Pro Thr Asp Glu Thr
275 280 285

Leu Asp Ser Ile Ala Leu Gln Lys Gln Pro Cys Arg Gly Asp Asn
290 295 300

Arg Asp Ile Gly Leu Glu Ala Asp Pro Gly Pro Pro Ser Leu Pro
305 310 315

Val Pro Leu Glu Thr Val Asp Lys His Gly Ser Ser Ala Ser Pro
320 325 330

Pro Gly His Trp Asp Tyr Ser Ser Asp Ala Ser Phe Val Thr Ala
335 340 345

Val Glu Val Ser Gly Ala Glu Asp Pro Ala Ser Asp Thr Pro Pro
350 355 360

Trp Ala Gly Ser Leu Pro Pro Thr Arg Gln Gly Leu Leu His Val
365 370 375

Val His Ala Asn Gln Arg Val Pro Arg Ser Gln Gly Thr Glu Ala
380 385 390

Glu Leu Asn Ala Arg Leu Gln Ala Leu Thr Leu Thr Pro Pro Asn
395 400 405

Ala Ala Gly Phe Gln Ser Ser Pro Ser Ser Met Pro Leu Leu Asp
410 415 420

Arg Ser Pro Ala His Ser Pro Pro Arg Thr Pro Thr Pro Gly Ala
425 430 435

Ser Asp Cys His Cys Leu Trp Glu His Gln Thr Ser Ile Asp Ser
440 445 450

Asp Met Ala Thr Leu Trp Leu Thr Glu Asp Glu Ala Ser Ser Thr
455 460 465

Gly Gly Arg Glu Pro Val Gly Pro Cys Arg His Leu Pro Val Ser
470 475 480

Thr Val Ser Asp Leu Glu Leu Leu Lys Gly Leu Arg Ala Leu Gly
485 490 495

Glu Asn Pro His Pro Ile Thr Pro Phe Thr Arg Gln Leu Tyr His
500 505 510

Gln Gln Leu Glu Glu Ala Gln Ile Ala Pro Gly Pro Glu Phe Ser
515 520 525

Gly His Ser Leu Glu Leu Ala Ala Ala Leu Arg Thr Gly Cys Ile
530 535 540

Pro Asp Val Gln Ala Asp Glu Asp Ala Leu Ala Gln Gln Phe Glu
545 550 555

Arg Pro Asp Pro Ala Arg Arg Trp Arg Glu Gly Val Val Lys Ser
560 565 570

Ser Phe Thr Tyr Leu Leu Leu Asp Pro Arg Glu Thr Gln Asp Leu
575 580 585

Pro Ala Arg Ala Phe Ser Leu Thr Pro Ala Glu Arg Leu Gln Thr
590 595 600

Phe Ile Arg Ala Ile Phe Tyr Val Gly Lys Gly Thr Arg Ala Arg
605 610 615

Pro Tyr Val His Leu Trp Glu Ala Leu Gly His His Gly Arg Ser
620 625 630

Arg Lys Gln Pro His Gln Ala Cys Pro Lys Val Arg Gln Ile Leu
635 640 645

Asp Ile Trp Ala Ser Gly Cys Gly Val Val Ser Leu His Cys Phe
650 655 660

Gln His Val Val Ala Val Glu Ala Tyr Thr Arg Glu Ala Cys Ile
665 670 675

Val Glu Ala Leu Gly Ile Gln Thr Leu Thr Asn Gln Lys Gln Gly
680 685 690

His Cys Tyr Gly Val Val Ala Gly Trp Pro Pro Ala Arg Arg Arg
695 700 705

Arg Leu Gly Val His Leu Leu His Arg Ala Leu Leu Val Phe Leu
710 715 720

Ala Glu Gly Glu Arg Gln Leu His Pro Gln Asp Ile Gln Ala Arg
725 730 735

Gly



<210> 11 

<211> 529 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 328B747CD1

<400> 11

Met Ser Arg Gln Phe Thr Tyr Lys Ser Gly Ala Ala Ala Lys Gly
1 5 10 15

Gly Phe Ser Gly Cys Ser Ala Val Leu Ser Gly Gly Ser Ser Ser
20 25 30

Ser Tyr Arg Ala Gly Gly Lys Gly Leu Ser Gly Gly Phe Ser Ser
35 40 45

Arg Ser Leu Tyr Ser Leu Gly Gly Ala Arg Ser Ile Ser Phe Asn
50 55 60

Val Ala Ser Gly Ser Gly Trp Ala Gly Gly Tyr Gly Phe Gly Arg
65 70 75

Gly Arg Ala Ser Gly Phe Ala Gly Ser Met Phe Gly Ser Val Ala
80 85 90

Leu Gly Ser Val Cys Pro Ser Leu Cys Pro Pro Gly Gly Ile His
95 100 105

Gln Val Thr Ile Asn Lys Ser Leu Leu Ala Pro Leu Asn Val Glu
110 115 120

Leu Asp Pro Glu Ile Gln Lys Val Arg Ala Gln Glu Arg Glu Gln
125 130 135

Ile Lys Val Leu Asn Asn Lys Phe Ala Ser Phe Ile Asp Lys Val
140 145 150

Arg Phe Leu Glu Gln Gln Asn Gln Val Leu Glu Thr Lys Trp Glu
155 160 165

Leu Leu Gln Gln Leu Asp Leu Asn Asn Cys Lys Asn Asn Leu Glu
170 175 180

Pro Ile Leu Glu Gly Tyr Ile Ser Asn Leu Arg Lys Gln Leu Glu
185 190 195

Thr Leu Ser Gly Asp Arg Val Arg Leu Asp Ser Glu Leu Arg Ser
200 205 210

Val Arg Glu Val Val Glu Asp Tyr Lys Lys Arg Tyr Glu Glu Glu
215 220 225

Ile Asn Lys Arg Thr Thr Ala Glu Asn Glu Phe Val Val Leu Lys
230 235 240

Lys Asp Val Asp Ala Ala Tyr Thr Ser Lys Val Glu Leu Gln Ala
245 250 255

Lys Val Asp Ala Leu Asp Gly Glu Ile Lys Phe Phe Lys Cys Leu
260 265 270

Tyr Glu Gly Glu Thr Ala Gln Ile Gln Ser His Ile Ser Asp Thr
275 280 285

Ser Ile Ile Leu Ser Met Asp Asn Asn Arg Asn Leu Asp Leu Asp
290 295 300

Ser Ile Ile Ala Glu Val Arg Ala Gln Tyr Glu Glu Ile Ala Arg
305 310 315

Lys Ser Lys Ala Glu Ala Glu Ala Leu Tyr Gln Thr Lys Phe Gln
320 325 330

Glu Leu Gln Leu Ala Ala Gly Arg His Gly Asp Asp Leu Lys His
335 340 345

Thr Lys Asn Glu Ile Ser Glu Leu Thr Arg Leu Ile Gln Arg Leu
350 355 360

Arg Ser Glu Ile Glu Ser Val Lys Lys Gln Cys Ala Asn Leu Glu
365 370 375

Thr Ala Ile Ala Asp Ala Glu Gln Arg Gly Asp Cys Ala Leu Lys
380 385 390

Asp Ala Arg Ala Lys Leu Asp Glu Leu Glu Gly Ala Leu Gln Gln
395 400 405

Ala Lys Glu Glu Leu Ala Arg Met Leu Arg Glu Tyr Gln Glu Leu
410 415 420

Leu Ser Val Lys Leu Ser Leu Asp Ile Glu Ile Ala Thr Tyr Arg
425 430 435

Lys Leu Leu Glu Gly Glu Glu Cys Arg Met Ser Gly Glu Tyr Thr
440 445 450

Asn Ser Val Ser Ile Ser Val Ile Asn Ser Ser Met Ala Gly Met
455 460 465

Ala Gly Thr Gly Ala Gly Phe Gly Phe Ser Asn Ala Gly Thr Tyr
470 475 480

Gly Tyr Trp Pro Ser Ser Val Ser Gly Gly Tyr Ser Met Leu Pro
485 490 495

Gly Gly Cys Val Thr Gly Ser Gly Asn Cys Ser Pro Pro Val Val
500 505 510

Ser Asn Val Thr Ser Thr Ser Gly Ser Ser Gly Ser Ser Arg Gly
515 520 525

Val Phe Gly Gly



<210> 12 

<211> 1367 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 8200016CD1

<400> 12

Met Ser His Tyr His Phe Ile Lys Cys Cys Cys Phe Gln Leu Cys
1 5 10 15

Asn Val Phe Arg Ser His Glu Met Glu Ile Asp Gln Cys Leu Leu
20 25 30

Glu Ser Leu Pro Leu Gly Gln Arg Gln Arg Leu Val Lys Arg Met
35 40 45

Arg Cys Glu Gln Ile Lys Ala Tyr Tyr Glu Arg Glu Lys Ala Phe
50 55 60

Gln Lys Gln Glu Gly Phe Leu Lys Arg Leu Lys His Ala Lys Asn
65 70 75

Pro Lys Val His Phe Asn Leu Thr Asp Met Leu Gln Asp Ala Ile
80 85 90

Ile His His Asn Asp Lys Glu Val Leu Arg Leu Leu Lys Glu Gly
95 100 105

Ala Asp Pro His Thr Leu Val Ser Ser Gly Gly Ser Leu Leu His
110 115 120

Leu Cys Ala Arg Tyr Asp Asn Ala Phe Ile Ala Glu Ile Leu Ile
125 130 135

Asp Arg Gly Val Asn Val Asn His Gln Asp Glu Asp Phe Trp Thr
140 145 150

Pro Met His Ile Ala Cys Ala Cys Asp Asn Pro Asp Ile Val Leu
155 160 165

Leu Leu Val Leu Ala Gly Ala Asn Val Leu Leu Gln Asp Val Asn
170 175 180

Gly Asn Ile Pro Leu Asp Tyr Ala Val Glu Gly Thr Glu Ser Ser
185 190 195

Ser Ile Leu Leu Thr Tyr Leu Asp Glu Asn Gly Val Asp Leu Thr
200 205 210

Ser Leu Arg Gln Met Lys Leu Gln Arg Pro Met Ser Met Leu Thr
215 220 225

Asp Val Lys His Phe Leu Ser Ser Gly Gly Asn Val Asn Glu Lys
230 235 240

Asn Asp Glu Gly Val Thr Leu Leu His Met Ala Cys Ala Ser Gly
245 250 255

Tyr Lys Glu Val Val Ser Leu Ile Leu Glu His Gly Gly Asp Leu
260 265 270

Asn Ile Val Asp Asp Gln Tyr Trp Thr Pro Leu His Leu Ala Ala
275 280 285

Lys Tyr Gly Gln Thr Asn Leu Val Lys Leu Leu Leu Met His Gln
290 295 300

Ala Asn Pro His Leu Val Asn Cys Asn Glu Glu Lys Ala Ser Asp
305 310 315

Ile Ala Ala Ser Glu Phe Ile Glu Glu Met Leu Leu Lys Ala Glu
320 325 330

Ile Ala Trp Glu Glu Lys Met Lys Glu Pro Leu Ser Ala Ser Thr
335 340 345

Leu Ala Gln Glu Glu Pro Tyr Glu Glu Ile Ile His Asp Leu Pro
350 355 360

Val Leu Ser Ser Lys Leu Ser Pro Leu Val Leu Pro Ile Ala Lys
365 370 375

Gln Asp Ser Leu Leu Glu Lys Asp Ile Met Phe Lys Asp Ala Thr
380 385 390

Lys Gly Leu Cys Lys Gln Gln Ser Gln Asp Ser Ile Pro Glu Asn
395 400 405

Pro Met Met Ser Gly Ser Thr Lys Pro Glu Gln Val Lys Leu Met
410 415 420

Pro Pro Ala Pro Asn Asp Asp Leu Ala Thr Leu Ser Glu Leu Asn
425 430 435

Asp Gly Ser Leu Leu Tyr Glu Ile Gln Lys Arg Phe Gly Asn Asn
440 445 450

Gln Ile Tyr Thr Phe Ile Gly Asp Ile Leu Leu Leu Val Asn Pro
455 460 465

Tyr Lys Glu Leu Pro Ile Tyr Ser Ser Met Val Ser Gln Leu Tyr
470 475 480

Phe Ser Ser Ser Gly Lys Leu Cys Ser Ser Leu Pro Pro His Leu
485 490 495

Phe Ser Cys Val Glu Arg Ala Phe His Gln Leu Phe Arg Glu Gln
500 505 510

Arg Pro Gln Cys Phe Ile Leu Ser Gly Glu Arg Gly Ser Gly Lys
515 520 525

Ser Glu Ala Ser Lys Gln Ile Ile Arg His Leu Thr Cys Arg Ala
530 535 540

Gly Ala Ser Arg Ala Thr Leu Asp Ser Arg Phe Lys His Val Val
545 550 555

Cys Ile Leu Glu Ala Phe Gly His Ala Lys Thr Thr Leu Asn Asp
560 565 570

Leu Ser Ser Cys Phe Ile Lys Tyr Phe Glu Leu Gln Phe Cys Glu
575 580 585

Arg Lys Gln Gln Leu Thr Gly Ala Arg Ile Tyr Thr Tyr Leu Leu
590 595 600

Glu Lys Ser Arg Leu Val Ser Gln Pro Leu Gly Gln Ser Asn Phe
605 610 615

Leu Ile Phe Tyr Leu Leu Met Asp Gly Leu Ser Ala Glu Glu Lys
620 625 630

Tyr Gly Leu His Leu Asn Asn Leu Cys Ala His Arg Tyr Leu Asn
635 640 645

Gln Thr Ile Gln Asp Asp Ala Ser Thr Gly Glu Arg Ser Leu Asn
650 655 660

Arg Glu Lys Leu Ala Val Leu Lys Arg Ala Leu Asn Val Val Gly
665 670 675

Phe Ser Ser Leu Glu Val Glu Asn Leu Phe Val Ile Leu Ala Ala
680 685 690

Ile Leu His Leu Gly Asp Ile Arg Phe Thr Ala Leu Asn Glu Gly
695 700 705

Asn Ser Ala Phe Val Ser Asp Leu Gln Leu Leu Glu Gln Val Ala
710 715 720

Gly Met Leu Gln Val Ser Thr Asp Glu Leu Ala Ser Ala Leu Thr
725 730 735

Thr Asp Ile Gln Tyr Phe Lys Gly Asp Met Ile Ile Arg Arg His
740 745 750

Thr Ile Gln Ile Ala Glu Phe Phe Arg Asp Leu Leu Ala Lys Ser
755 760 765

Leu Tyr Ser Arg Leu Phe Ser Phe Leu Val Asn Thr Met Asn Ser
770 775 780

Cys Leu His Ser Gln Asp Glu Gln Lys Ser Met Gln Thr Leu Asp
785 790 795

Ile Gly Ile Leu Asp Ile Phe Gly Phe Glu Glu Phe Gln Lys Asn
800 805 810

Glu Phe Glu Gln Leu Cys Val Asn Met Thr Asn Glu Lys Met His
815 820 825

His Tyr Ile Asn Glu Val Leu Phe Leu His Glu Gln Val Glu Cys
830 835 840

Val Gln Glu Gly Val Thr Met Glu Thr Ala Tyr Ser Ala Gly Asn
845 850 855

Gln Asn Gly Val Leu Asp Phe Phe Phe Gln Lys Pro Ser Gly Phe
860 865 870

Leu Thr Leu Leu Asp Glu Glu Ser Gln Met Ile Trp Ser Val Glu
875 880 885

Ser Asn Phe Pro Lys Lys Leu Gln Ser Leu Leu Glu Ser Ser Asn
890 895 900

Thr Asn Ala Val Tyr Ser Pro Met Lys Asp Gly Asn Gly Asn Val
905 910 915

Ala Leu Lys Asp His Gly Thr Ala Phe Thr Ile Met His Tyr Ala
920 925 930

Gly Arg Val Met Tyr Asp Val Val Gly Ala Ile Glu Lys Asn Lys
935 940 945

Asp Ser Leu Ser Gln Asn Leu Leu Phe Val Met Lys Thr Ser Glu
950 955 960

Asn Val Val Ile Asn His Leu Phe Gln Ser Lys Leu Ser Gln Thr
965 970 975

Gly Ser Leu Val Ser Ala Tyr Pro Ser Phe Lys Phe Arg Gly His
980 985 990

Lys Ser Ala Leu Leu Ser Lys Lys Met Thr Ala Ser Ser Ile Ile
995 1000 1005

Gly Glu Asn Lys Asn Tyr Leu Glu Leu Ser Lys Leu Leu Lys Lys
1010 1015 1020

Lys Gly Thr Ser Thr Phe Leu Gln Arg Leu Glu Arg Gly Asp Pro
1025 1030 1035

Val Thr Ile Ala Ser Gln Leu Arg Lys Ser Leu Met Asp Ile Ile
1040 1045 1050

Gly Lys Leu Gln Lys Cys Thr Pro His Phe Ile His Cys Ile Arg
1055 1060 1065

Pro Asn Asn Ser Lys Leu Pro Asp Thr Phe Asp Asn Phe Tyr Val
1070 1075 1080

Ser Ala Gln Leu Gln Tyr Ile Gly Val Leu Glu Met Val Lys Ile
1085 1090 1095

Phe Arg Tyr Gly Tyr Pro Val Arg Leu Ser Phe Ser Asp Phe Leu
1100 1105 1110

Ser Arg Tyr Lys Pro Leu Ala Asp Thr Phe Leu Arg Glu Lys Lys
1115 1120 1125

Glu Gln Ser Ala Ala Glu Arg Cys Arg Leu Val Leu Gln Gln Cys
1130 1135 1140

Lys Leu Gln Gly Trp Gln Met Gly Val Arg Lys Val Phe Leu Lys
1145 1150 1155

Tyr Trp His Ala Asp Gln Leu Asn Asp Leu Cys Leu Gln Leu Gln
1160 1165 1170

Arg Lys Ile Ile Thr Cys Gln Lys Val Ile Arg Gly Phe Leu Ala
1175 1180 1185

Arg Gln His Leu Leu Gln Arg Met Ser Ile Arg Gln Gln Glu Val
1190 1195 1200

Thr Ser Ile Asn Ser Phe Leu Gln Asn Thr Glu Asp Met Gly Leu
1205 1210 1215

Lys Thr Tyr Asp Ala Leu Val Ile Gln Asn Ala Ser Asp Ile Ala
1220 1225 1230

Arg Glu Asn Asp Arg Leu Arg Ser Glu Met Asn Ala Pro Tyr His
1235 1240 1245

Lys Glu Lys Leu Glu Val Arg Asn Met Gln Glu Glu Gly Ser Lys
1250 1255 1260

Arg Thr Asp Asp Lys Ser Gly Pro Arg His Phe His Pro Ser Ser
1265 1270 1275

Met Ser Val Cys Ala Ala Val Asp Gly Leu Gly Gln Cys Leu Val
1280 1285 1290

Gly Pro Ser Ile Trp Ser Pro Ser Leu His Ser Val Phe Ser Met
1295 1300 1305

Asp Asp Ser Ser Ser Leu Pro Ser Pro Arg Lys Gln Pro Pro Pro
1310 1315 1320

Lys Pro Lys Arg Asp Pro Asn Thr Arg Leu Ser Ala Ser Tyr Glu
1325 1330 1335

Ala Val Ser Ala Cys Leu Ser Ala Ala Arg Glu Ala Ala Asn Glu
1340 1345 1350

Gly Gln Pro Trp Gly Gly Thr Gln Pro Arg Val Pro Gly Ser Arg
1355 1360 1365

Met Leu



<210> 13 

<211> 929 

<212> PRT 

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 3291962CD1

<400> 13

Met Ala Glu Val Glu Ala Val Gln Leu Lys Glu Glu Gly Asn Arg
1 5 10 15

His Phe Gln Leu Gln Asp Tyr Lys Ala Ala Thr Asn Ser Tyr Ser
20 25 30

Gln Ala Leu Lys Leu Thr Lys Asp Lys Ala Leu Leu Ala Thr Leu
35 40 45

Tyr Arg Asn Arg Ala Ala Cys Gly Leu Lys Thr Glu Ser Tyr Val
50 55 60

Gln Ala Ala Ser Asp Ala Ser Arg Ala Ile Asp Ile Asn Ser Ser
65 70 75

Asp Ile Lys Ala Leu Tyr Arg Arg Cys Gln Ala Leu Glu His Leu
80 85 90

Gly Lys Leu Asp Gln Ala Phe Lys Asp Val Gln Arg Cys Ala Thr
95 100 105

Leu Glu Pro Arg Asn Gln Asn Phe Gln Glu Met Leu Arg Arg Leu
110 115 120

Asn Thr Ser Ile Gln Glu Lys Leu Arg Val Gln Phe Ser Thr Asp
125 130 135

Ser Arg Val Gln Lys Met Phe Glu Ile Leu Leu Asp Glu Asn Ser
140 145 150

Glu Ala Asp Lys Arg Glu Lys Ala Ala Asn Asn Leu Ile Val Leu
155 160 165

Gly Arg Glu Glu Ala Gly Ala Glu Lys Ile Phe Gln Asn Asn Gly
170 175 180

Val Ala Leu Leu Leu Gln Leu Leu Asp Thr Lys Lys Pro Glu Leu
185 190 195

Val Leu Ala Ala Val Arg Thr Leu Ser Gly Met Cys Ser Gly His
200 205 210

Gln Ala Arg Ala Thr Val Ile Leu His Ala Val Arg Ile Asp Arg
215 220 225

Ile Cys Ser Leu Met Ala Val Glu Asn Glu Glu Met Ser Leu Ala
230 235 240

Val Cys Asn Leu Leu Gln Ala Ile Ile Asp Ser Leu Ser Gly Glu
245 250 255

Asp Lys Arg Glu His Arg Gly Lys Glu Glu Ala Leu Val Leu Asp
260 265 270

Thr Lys Lys Asp Leu Lys Gln Ile Thr Ser His Leu Leu Asp Met
275 280 285

Leu Val Ser Lys Lys Val Ser Gly Gln Gly Arg Asp Gln Ala Leu
290 295 300

Asn Leu Leu Asn Lys Asn Val Pro Arg Lys Asp Leu Ala Ile His
305 310 315

Asp Asn Ser Arg Thr Ile Tyr Val Val Asp Asn Gly Leu Arg Lys
320 325 330

Ile Leu Lys Val Val Gly Gln Val Pro Asp Leu Pro Ser Cys Leu
335 340 345

Pro Leu Thr Asp Asn Thr Arg Met Leu Ala Ser Ile Leu Ile Asn
350 355 360

Lys Leu Tyr Asp Asp Leu Arg Cys Asp Pro Glu Arg Asp His Phe
365 370 375

Arg Lys Ile Cys Glu Glu Tyr Ile Thr Gly Lys Phe Asp Pro Gln
380 385 390

Asp Met Asp Lys Asn Leu Asn Ala Ile Gln Thr Val Ser Gly Ile
395 400 405

Leu Gln Gly Pro Phe Asp Leu Gly Asn Gln Leu Leu Gly Leu Lys
410 415 420

Gly Val Met Glu Met Met Val Ala Leu Cys Gly Ser Glu Arg Glu
425 430 435

Thr Asp Gln Leu Val Ala Val Glu Ala Leu Ile His Ala Ser Thr
440 445 450

Lys Leu Ser Arg Ala Thr Phe Ile Ile Thr Asn Gly Val Ser Leu
455 460 465

Leu Lys Gln Ile Tyr Lys Thr Thr Lys Asn Glu Lys Ile Lys Ile
470 475 480

Arg Thr Leu Val Gly Leu Cys Lys Leu Gly Ser Ala Gly Gly Thr
485 490 495

Asp Tyr Gly Leu Arg Gln Phe Ala Glu Gly Ser Thr Glu Lys Leu
500 505 510

Ala Lys Gln Cys Arg Lys Trp Leu Cys Asn Met Ser Ile Asp Thr
515 520 525

Arg Thr Arg Arg Trp Ala Val Glu Gly Leu Ala Tyr Leu Thr Leu
530 535 540

Asp Ala Asp Val Lys Asp Asp Phe Val Gln Asp Val Pro Ala Leu
545 550 555

Gln Ala Met Phe Glu Leu Ala Lys Thr Ser Asp Lys Thr Ile Leu
560 565 570

Tyr Ser Val Ala Thr Thr Leu Val Asn Cys Thr Asn Ser Tyr Asp
575 580 585

Val Lys Glu Val Ile Pro Glu Leu Val Gln Leu Ala Lys Phe Ser
590 595 600

Lys Gln His Val Pro Glu Glu His Pro Lys Asp Lys Lys Asp Phe
605 610 615

Ile Asp Met Arg Val Lys Arg Leu Leu Lys Ala Gly Val Ile Ser
620 625 630

Ala Leu Ala Cys Met Val Lys Ala Asp
635

Gin Thr Lys Glu Leu Leu Ala Arg Val 

eso 

Asn Pro Lys Asp Arg Gly Thr lie Val 

665 

Ala Leu lie Pro Leu Ala Leu Glu Gly 
680 

Lys Ala Ala His Ala Leu Ala Lys lie 

695 

Asp lie Ala Phe Pro Gly Glu Arg Val 
710 

Leu Val Arg Leu Leu Asp Thr Gin Arg 
725 

Glu Ala Leu Leu Gly Leu Thr Asn Leu 
740 

Leu Arg Gin Lys lie Phe Lys Glu Arg 
755 

Asn Tyr Met Phe Glu Asn His Asp Qln 
770 

Glu Cys Met Cys Asn Met Val Leu His 
785 

Phe Leu Ala Asp Gly Asn Asp Arg Leu 

800 

cys Gly Glu Asp Asp Asp Lys Val Qln 
815 

Leu Ala Met Leu Thr Ala Ala His Lys 

830 

Thr Gin Val Thr Thr Gin Trp Leu Glu 
845 

Leu His Asp Gin Leu Ser Val Gin His 
860 

Tyr Asn Leu Leu Ala Ala Asp Ala Glu 
875 

Glu Ser Glu Leu Leu Glu lie Leu Thr 
890 

Pro Asp Glu Lys Lys Ala Glu Val Val 
905 

Leu lie Lys Cys Met Asp Tyr Gly Phe 
920 

<210> 14 

<211> 530 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 1234259CD1 

<400> 14 

Met Met Ser Glu His Asp Leu Ala Asp 

1 5 
Glu Asp Leu Ser Pro Asp His Pro Val 
20 

Val Thr Asp Glu Asp Glu Pro Ala Leu
Ser


Ala lie Leu Thr 


Asp 


640 




645 


Phe 


Leu Ala Leu Cys 


Asp 


655 




660 


Ala 


Gin Qly Gly Gly 


Lys 


670 




675 


Thr 


Asp Val Gly Lys 


Val 


685 




690 


Ala 


Ala Val Ser Asn 


Pro 


700 




705 


Tvr 


Glu Val Val Arg 


Pro 


715 




720 


Asp 


Qly Leu Gin Asn 


Tvr 


730 




735 


Ser 


Gly Arg Ser Asp 


Lvs 


745 




750 


Ala 


Leu Pro Asp Jle 


Glu 


760 




765 


Leu 


Arg Gin Ala Ala 


Thr 


775 




780 


Lvs 


Glu Val Gin Glu 


Arg 


790 




795 




Leu Val Val Leu 


Leu 


805 




810 


Asn 


Ala Ala Ala Gly 


Ala 


820 




825 


Lys 


Leu Cys Leu Lys 


Met 


835 




840 


lie 


Leu Gin Arg Leu 


Cvs 


850 




855 


Ara 


Gly Leu Val lie 


Ala 


865 




870 


Leu 


Ala Lys Lys Leu 


Val 


880 




885 


Val 


Val Gly Lys Gin 


Glu 


895 




900 


Gin 


Thr Ala Arg Glu 


Cys 


910 




915 


lie 


Lys Pro Val Ser 




925 






Val 


Val Gin He Ala 


Val 


10 




15 


Val 


Leu Glu Asn His 


Val 


25 




30 


Lys 


Arg Gin Arg Leu 


Glu
35


40 






45 


lie 


Asn 


CvB Gin 


Asp 


Pro Ser He Lys Ser 


Phe 


Leu Tyr Ser 


He 








50 


55 






60 


Asn 


Gin 


Thr He 


Cys 


Leu Arg Leu Asp Ser 


He 


Glu Ala Lys 


Leu 








65 


70 






75 


Qln 


Ala 


TjOU Qlu 

X'OU VJf «^ wk 


Ala 


Thr Cys Lys Ser Leu 


Glu 


Glu Lys Leu 


Aso 








80 


85 






90 


Leu 


Val 


Thr Asn 


Lys 


Gin His Ser Pro He 


Gin 


Val Pro Met 


Val 








95 


100 






105 


Ala Qly 


Ser Pro 


Leu 


Qlv Ala Thr* Gin Thr 

WXJ^ nXO A &AX \7X** XiAX 


Cys 


Asn Lys Val 


Arg 








110 


115 






120 


Cys 


Val 


Val Pro 


Gin 


Thr* Thr Val Tie Leu 

JkAlX XllX VCIX XXC UCU 


Asn 


Asn. AsD Arc 


Gin 








125 


130 






135 


Asn 


Ala 


Tl o Vnl 
xxo vcix 


Axel 


iJjf a ITlCU wXU As^ irxu 


Leu 


J?^T" A^n AT"rr 

OCX AX VJ 


Ala 








140 


145 






150 


Pro 


Asp 






AslXl VAX XXts OCsX AEvXA 


Ala 


Val Pro Qlv 

VAX XX W WXJ^ 


Am 
Axy 










160 






165 


Arg Oln 


A tan Tin T* 


XX c 


Val Val T.vs Val Pto 
vax vcix ujf<3 vcix irxu 


Qlv 


Gl n Gl 11 Afin 

^7X*A \9XU CKO^ 


Ser 








170 


175 






180 


His 


His 


11 A no 
wxU nB>^ 


Gly 


D1 11 R^ir a')\r SeT* f?1 ti 

V3XU OCX ^Xjf 0CX VJXU 


Ala 


Ser Asn Se5* 

0CX t%S9Jgf 


Val 








185 

XO «J 


190 






195 


Ser 


Ser 


Cys Gly 


Gin 


Ala Glv S©^ Oln Seir 

nXCl ^7Xj^ 0CX V7X*4 0CX 


He 


Qlv Ser Asn 

XJJLmJ liJ^3X 


Val 








200 


205 






210 


Thr 


Leu 




Leu 


Atari Qor* ftlii rtl ii Actn 

Ami OCX \7XU V7XU AO^ 




Pi*o Asn Qlv 

XXW flOll WXJ^ 


Thr 








215 


220 






225 


Trp 


Leu 




Glu 


Asn Ann G^ii 

ACSfl ASXX Ii^%J wXU lICw 


Arg 


Val Aircf Cvs 

V c( X f&x y ^ jr 


Ala 








230 


235 






240 


lie 


lie 


9iro Scat* 

OCX 




Mefc Ijeu His Tie Ser 
ricu xicu nxo xxc ocx 


Thr 


Asn Cva Ara 


Thr 








245 


250 






255 


Ala 


Glu 


T.\ria Mot' 


AXa 


T.ckii 'Plif T.eii Aisn 

iJCU i. IIX. iJCU JJtSU AB^ 




Leu Phe His 

ucu JTXAC nxo 


Arg 








^ V V 


265 






270 


Qlu 


Val 


rtl n 211 a 
VJ-Ln JnkXci 


vax 


OCX Asii ucU OCX \jA.y 


Gl Tk 
wXXl 




Glv 








975 


280 






285 


Lys 


Lys 


UC3U 


Asp 


^■*"o Ti^ti 'PVit" Tift TH/T" 
f xu ucu xiix xxc Ajfx 


Glv 


Tie Am Pvq 

xxc flX \J ^ JT ^ 


His 








^ 7 u 


995 






300 


Leu 


Phe 






VjrXjf XXc XXlx ^9XU ocx 


Asp 


AX^ XjfX AXg 


Tie 

xxc 








^ vl ^ 


JXU 






315 

•J X —1 


Lys 


Gin 


Cav Tl e 

•ber XJ.e 


Asp 


oer JLiys cys Axg^ mx 


Al a 

AX A 


*Pyf* Ai^rr AT"rT 
X X ^ Ax y AX y 


Lys 








■^90 


325 






330 


Gin 


Arg 




Ser 


T.A11 'Ala T^al T.vfa QoT* 
XlcU AXC» VAX JLtyfS OCX 


Phe 


Am Jkirrt 
OCX AX^ Axy 


Thr 








335 


340 






345 


Pro 


Asn 


OCX OCX 


•9 CSX 


TV/T* ("*\7ci Pt*o 55fty" tl 
jLj^x ^jfS* ir^^J OCX wxu 


Pro 


Met: Met: Ser 


Thr 








350 


355 






360 


Pro 


Pro 


ZrxVi^ AXO 


oox 


QT \i Tkftu Pt*** Qln Piro 
oxu Ajcu jrxu wxii f xv^ 


Qln 


Pro Qln Pro 


Qln 








365 


370 






375 


Ala 


Leu 


His TVr 


Ala 


T.011 Ala Aian Ala .Gin 

XICU AX A AdIA ax a wXll 


Gin 


Val Gin He 


His 








380 


385 






390 


Gin 


He 


Qly Qlu 


Asp 


Glv Gin Val Qln Val 

wXjf VIXIJ VAX wXlj V nx 


He 


Pro Qln Qly 


His 








395 


400 






405 


Leu 


His 


He Ala 


Gin 


Val Pro Gin Gly Glu 


Gin 


Val Gin He 


Thr 








410 


415 






420 


Gin 


Asp 


Ser Olu 


Gly 


Asn Leu Qln He His 


His 


Val Gly Gin 


Asp 








425 


430 






435 


Gly Qln 


Leu Leu 


Glu 


Ala Thr Arg He Pro 


Cys 


Leu Leu Ala 


Pro 








440 


445 






450 


Ser Val 


Phe Lys 


Ala 


Ser Ser Gly Qln Val 


Leu 


Gin Gly Ala 


Gln
455 460 465

Leu Ile Ala Val Ala Ser Ser Asp Pro Ala Ala Ala Gly Val Asp
470 475 480

Gly Ser Pro Leu Gln Gly Ser Asp Ile Gln Val Gln Tyr Val Gln
485 490 495

Leu Ala Pro Val Ser Asp His Thr Ala Gly Ala Gln Thr Ala Glu
500 505 510

Ala Leu Gln Pro Thr Leu Gln Pro Glu Met Gln Leu Glu His Gly
515 520 525

Ala Ile Gln Ile Gln
530

<210> 15 

<211> 821 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 1440608CD1



<400> 15 



Met 


Ala Lys 


Phe 


Ala 


Leu 


Asn 


n Asn Ti@ii 

tXf9H XJwU 


Pt'o Asn Xjeu Glv Qlv 










5 






10 


15 


Pro 


Arg 


Leu 


V-»Jf o 


Pro 


Val 


Pro 


Ala Ala Qlv 

ntjun w^jf 


Glv Ala ArcF Ser Pro 










20 






25 


30 


Ser 


Ser 


Pro 


Tyr 


Ser 


V dX 


Urxu 


Xllt JTxU XjfX^ 


Vji Jr X Al w XJCU nE>^ 










35 






40 


45 


Leu Asp Phe Leu Lys Tyr Ile Glu Glu Leu Glu Arg Gly Pro Ala

50 55 60

Ala Arg Arg Ala Pro Gly Pro Pro Thr Ser Arg Arg Pro Arg Ala

65 70 75

Pro Arg Pro Gly Leu Ala Gly Ala Arg Ser Pro Gly Ala Trp Thr

80 85 90

Ser Ser Glu Ser Leu Ala Ser Asp Asp Gly Gly Ala Pro Gly Ile

95 100 105

Leu Ser Gln Gly Ala Pro Ser Gly Leu Leu Met Gln Pro Leu Ser

110 115 120

Pro Arg Ala Pro Val Arg Asn Pro Arg Val Glu His Thr Leu Arg

125 130 135

Glu Thr Ser Arg Arg Leu Glu Leu Ala Gln Thr His Glu Arg Ala

140 145 150

Pro Ser Pro Gly Arg Gly Val Pro Arg Ser Pro Arg Gly Ser Gly

155 160 165

Arg Ser Ser Pro Ala Pro Asn Leu Ala Pro Ala Ser Pro Gly Pro

170 175 180

Ala Gln Leu Gln Leu Val Arg Glu Gln Met Ala Ala Ala Leu Arg

185 190 195

Arg Leu Arg Glu Leu Glu Asp Gln Ala Arg Thr Leu Pro Glu Leu

200 205 210

Gln Glu Gln Val Arg Ala Leu Arg Ala Glu Lys Ala Arg Leu Leu

215 220 225

Ala Gly Arg Ala Gln Pro Glu Pro Asp Gly Glu Ala Glu Thr Arg

230 235 240

Pro Asp Lys Leu Ala Gln Leu Arg Arg Leu Thr Glu Arg Leu Ala

245 250 255

Thr Ser Glu Arg Gly Gly Arg Ala Arg Ala Ser Pro Arg Ala Asp

260 265 270

Ser Pro Asp Gly Leu Ala Ala Gly Arg Ser Glu Gly Ala Leu Gln

275 280 285

Val Leu Asp Gly Glu Val Gly Ser Leu Asp Gly Thr Pro Gln Thr

290 295 300

Arg Glu Val Ala Ala Glu Ala Val Pro Glu Thr Arg Glu Ala Gly

305 310 315

Ala Gln Ala Val Pro Glu Thr Arg Glu Ala Gly Val Glu Ala Ala

320 325 330

Pro Glu Thr Val Glu Ala Asp Ala Trp Val Thr Glu Ala Leu Leu

335 340 345

Gly Leu Pro Ala Ala Ala Glu Arg Glu Leu Glu Leu Leu Arg Ala

350 355 360

Ser Leu Glu His Gln Arg Gly Val Ser Glu Leu Leu Arg Gly Arg

365 370 375

Leu Arg Glu Leu Glu Glu Ala Arg Glu Ala Ala Glu Glu Ala Ala

380 385 390

Ala Gly Ala Arg Ala Gln Leu Arg Glu Ala Thr Thr Gln Thr Pro

395 400 405

Trp Ser Cys Ala Glu Lys Ala Ala Gln Thr Glu Ser Pro Ala Glu

410 415 420

Ala Pro Ser Leu Thr Gln Glu Ser Ser Pro Gly Ser Met Asp Gly

425 430 435

Asp Arg Ala Val Ala Pro Ala Gly Ile Leu Lys Ser Ile Met Lys

440 445 450

Lys Arg Asp Gly Thr Pro Gly Ala Gln Pro Ser Ser Gly Pro Lys

455 460 465

Ser Leu Gln Phe Val Gly Val Leu Asn Gly Glu Tyr Glu Ser Ser

470 475 480

Ser Ser Glu Asp Ala Ser Asp Ser Asp Gly Asp Ser Glu Asn Gly

485 490 495

Gly Ala Glu Pro Pro Gly Ser Ser Ser Gly Ser Gly Asp Asp Ser

500 505 510

Gly Gly Gly Ser Asp Ser Gly Thr Pro Gly Pro Pro Ser Gly Gly

515 520 525

Asp Ile Arg Asp Pro Glu Pro Glu Ala Glu Ala Glu Pro Gln Gln

530 535 540

Val Ala Gln Gly Arg Cys Glu Leu Ser Pro Arg Leu Arg Glu Ala

545 550 555

Cys Val Ala Leu Gln Arg Gln Leu Ser Arg Pro Arg Gly Val Ala

560 565 570

Ser Asp Gly Gly Ala Val Arg Leu Val Ala Gln Glu Trp Phe Arg

575 580 585

Val Ser Ser Gln Arg Arg Ser Gln Ala Glu Pro Val Ala Arg Met

590 595 600

Leu Glu Gly Val Arg Arg Leu Gly Pro Glu Leu Leu Ala His Val

605 610 615

Val Asn Leu Ala Asp Gly Asn Gly Asn Thr Ala Leu His Tyr Ser

620 625 630

Val Ser His Gly Asn Leu Ala Ile Ala Ser Leu Leu Leu Asp Thr

635 640 645

Gly Ala Cys Glu Val Asn Arg Gln Asn Arg Ala Gly Tyr Ser Ala

650 655 660

Leu Met Leu Ala Ala Leu Thr Ser Val Arg Gln Glu Glu Glu Asp

665 670 675
665 670 675 
Met Ala Val Val Gln Arg Leu Phe Cys Met Gly Asp Val Asn Ala

680 685 690

Lys Ala Ser Gln Thr Gly Gln Thr Ala Leu Met Leu Ala Ile Ser

695 700 705

His Gly Arg Gln Asp Met Val Ala Thr Leu Leu Ala Cys Gly Ala

710 715 720

Asp Val Asn Ala Gln Asp Ala Asp Gly Ala Thr Ala Leu Met Cys

725 730 735

Ala Ser Glu Tyr Gly Arg Leu Asp Thr Val Arg Leu Leu Leu Thr

740 745 750

Gln Pro Gly Cys Asp Pro Ala Ile Leu Asp Asn Glu Gly Thr Ser

755 760 765

Ala Leu Ala Ile Ala Leu Glu Ala Glu Gln Asp Glu Val Ala Ala

770 775 780

Leu Leu His Ala His Leu Ser Ser Gly Gln Pro Asp Thr Gln Ser

785 790 795

Glu Ser Pro Pro Gly Ser Gln Thr Ala Thr Pro Gly Glu Gly Glu

800 805 810

Cys Gly Asp Asn Gly Glu Asn Pro Gln Val Gln

815 820 

<210> 16 

<211> 1003 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3413610CD1

<400> 16 

Met Ala Arg Arg Gly Lys Lys Pro Val Val Arg Thr Leu Glu Asp
1 5 10 15

Leu Thr Leu Asp Ser Gly Tyr Gly Gly Ala Ala Asp Ser Val Arg
20 25 30

Ser Ser Asn Leu Ser Leu Cys Cys Ser Asp Ser His Pro Ala Ser
35 40 45

Pro Tyr Gly Gly Ser Cys Trp Pro Pro Leu Ala Asp Ser Met His
50 55 60

Ser Arg His Asn Ser Phe Asp Thr Val Asn Thr Ala Leu Val Glu
65 70 75

Asp Ser Glu Gly Leu Asp Cys Ala Gly Gln His Cys Ser Arg Leu
80 85 90

Leu Pro Asp Leu Asp Glu Val Pro Trp Thr Leu Gln Glu Leu Glu
95 100 105

Ala Leu Leu Leu Arg Ser Arg Asp Pro Arg Ala Gly Pro Ala Val
110 115 120

Pro Gly Gly Leu Pro Lys Asp Ala Leu Ala Lys Leu Ser Thr Leu
125 130 135

Val Ser Arg Ala Leu Val Arg Ile Ala Lys Glu Ala Gln Arg Leu
140 145 150

Ser Leu Arg Phe Ala Lys Cys Thr Lys Tyr Glu Ile Gln Ser Ala
155 160 165

Met Glu Ile Val Leu Ser Trp Gly Leu Ala Ala His Cys Thr Ala
170 175 180

Ala Ala Leu Ala Ala Leu Ser Leu Tyr Asn Met Ser Ser Ala Gly
185 190 195

Gly Asp Arg Leu Gly Arg Gly Lys Ser Ala Arg Cys Gly Leu Thr

200 205 210

Phe Ser Val Gly Arg Val Tyr Arg Trp Met Val Asp Ser Arg Val

215 220 225

Ala Leu Arg Ile His Glu His Ala Ala Ile Tyr Leu Thr Ala Cys

230 235 240

Met Glu Ser Leu Phe Arg Asp Ile Tyr Ser Arg Val Val Ala Ser

245 250 255

Gly Val Pro Arg Ser Cys Ser Gly Pro Gly Ser Gly Ser Gly Ser

260 265 270

Gly Pro Gly Pro Ser Ser Gly Pro Gly Ala Ala Pro Ala Ala Asp

275 280 285

Lys Glu Arg Glu Ala Pro Gly Gly Gly Ala Ala Ser Gly Gly Ala

290 295 300

Cys Ser Ala Ala Ser Ser Ala Ser Gly Gly Ser Ser Cys Cys Ala

305 310 315

Pro Pro Ala Ala Ala Ala Ala Ala Val Pro Pro Thr Ala Ala Ala

320 325 330

Asn His His His His His His His Ala Leu His Glu Ala Pro Lys

335 340 345

Phe Thr Val Glu Thr Leu Glu His Thr Val Asn Asn Asp Ser Glu

350 355 360

Ile Trp Gly Leu Leu Gln Pro Tyr Gln His Leu Ile Cys Gly Lys

365 370 375

Asn Ala Ser Gly Asp Leu Val Ser Arg Ala Met His His Leu Gln

380 385 390

Pro Leu Gln Val Glu Arg Pro Phe Leu Val Leu Pro Pro Leu Met

395 400 405

Glu Trp Ile Arg Val Ala Val Ala His Ala Gly His Arg Arg Ser

410 415 420

Phe Ser Met Asp Ser Asp Asp Val Arg Gln Ala Ala Arg Leu Leu

425 430 435

Leu Pro Gly Val Asp Cys Glu Pro Arg Gln Leu Arg Ala Asp Asp

440 445 450

Cys Phe Cys Ala Ser Arg Lys Leu Asp Ala Val Ala Ile Glu Ala

455 460 465

Lys Phe Lys Gln Asp Leu Gly Phe Arg Met Leu Asn Cys Gly Arg

470 475 480

Thr Asp Leu Val Lys Gln Ala Val Ser Leu Leu Gly Pro Asp Gly

485 490 495

Ile Asn Thr Met Ser Glu Gln Gly Met Thr Pro Leu Met Tyr Ala

500 505 510

Cys Val Arg Gly Asp Glu Ala Met Val Gln Met Leu Leu Asp Ala

515 520 525

Gly Ala Asp Leu Asn Val Glu Val Val Ser Thr Pro His Lys Tyr

530 535 540

Pro Ser Val His Pro Glu Thr Arg His Trp Thr Ala Leu Thr Phe

545 550 555

Ala Val Leu His Gly His Ile Pro Val Val Gln Leu Leu Leu Asp

560 565 570

Ala Gly Ala Lys Val Glu Gly Ser Val Glu His Gly Glu Glu Asn

575 580 585

Tyr Ser Glu Thr Pro Leu Gln Leu Ala Ala Ala Val Gly Asn Phe

590 595 600

Glu Leu Val Ser Leu Leu Leu Glu Arg Gly Ala Asp Pro Leu Ile

605 610 615

Gly Thr Met Tyr Arg Asn Gly Ile Ser Thr Thr Pro Gln Gly Asp

620 625 630

Met Asn Ser Phe Ser Gln Ala Ala Ala His Gly His Arg Asn Val

635 640 645

Phe Arg Lys Leu Leu Ala Gln Pro Glu Lys Glu Lys Ser Asp Ile

650 655 660

Leu Ser Leu Glu Glu Ile Leu Ala Glu Gly Thr Asp Leu Ala Glu

665 670 675

Thr Ala Pro Pro Pro Leu Cys Ala Ser Arg Asn Ser Lys Ala Lys

680 685 690

Leu Arg Ala Leu Arg Glu Ala Met Tyr His Ser Ala Glu His Gly

695 700 705

Tyr Val Asp Val Thr Ile Asp Ile Arg Ser Ile Gly Val Pro Trp

710 715 720

Thr Leu His Thr Trp Leu Glu Ser Leu Arg Ile Ala Phe Gln Gln

725 730 735

His Arg Arg Pro Leu Ile Gln Cys Leu Leu Lys Glu Phe Lys Thr

740 745 750

Ile Gln Glu Glu Glu Tyr Thr Glu Glu Leu Val Thr Gln Gly Leu

755 760 765

Pro Leu Met Phe Glu Ile Leu Lys Ala Ser Lys Asn Glu Val Ile

770 775 780

Ser Gln Gln Leu Cys Val Ile Phe Thr His Cys Tyr Gly Pro Tyr

785 790 795

Pro Ile Pro Lys Leu Thr Glu Ile Lys Arg Lys Gln Thr Ser Arg

800 805 810

Leu Asp Pro His Phe Leu Asn Asn Lys Glu Met Ser Asp Val Thr

815 820 825

Phe Leu Val Glu Gly Arg Pro Phe Tyr Ala His Lys Val Leu Leu

830 835 840

Phe Thr Ala Ser Pro Arg Phe Lys Ala Leu Leu Ser Ser Lys Pro

845 850 855

Thr Asn Asp Gly Thr Cys Ile Glu Ile Gly Tyr Val Lys Tyr Ser

860 865 870

Ile Phe Gln Leu Val Met Gln Tyr Leu Tyr Tyr Gly Gly Pro Glu

875 880 885

Ser Leu Leu Ile Lys Asn Asn Glu Ile Met Glu Leu Leu Ser Ala

890 895 900

Ala Lys Phe Phe Gln Leu Glu Ala Leu Gln Arg His Cys Glu Ile

905 910 915

Ile Cys Ala Lys Ser Ile Asn Thr Asp Asn Cys Val Asp Ile Tyr

920 925 930

Asn His Ala Lys Phe Leu Gly Val Thr Glu Leu Ser Ala Tyr Cys

935 940 945

Glu Gly Tyr Phe Leu Lys Asn Met Met Val Leu Ile Glu Asn Glu

950 955 960

Ala Phe Lys Gln Leu Leu Tyr Asp Lys Asn Gly Glu Gly Thr Gly

965 970 975

Gln Asp Val Leu Gln Asp Leu Gln Arg Thr Leu Ala Ile Arg Ile

980 985 990

Gln Ser Ile His Leu Ser Ser Ser Lys Gly Ser Val Val

995 1000 

<210> 17 
<211> 888 
<212> PRT

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3276394CD1

<400> 17 

Met Asp Glu Ser Ala Leu Leu Asp Leu Leu Glu Cys Pro Val Cys

1 5 10 15

Leu Glu Arg Leu Asp Ala Ser Ala Lys Val Leu Pro Cys Gln His

20 25 30

Thr Phe Cys Lys Arg Cys Leu Leu Gly Ile Val Gly Ser Arg Asn

35 40 45

Glu Leu Arg Cys Pro Glu Cys Arg Thr Leu Val Gly Ser Gly Val

50 55 60

Glu Glu Leu Pro Ser Asn Ile Leu Leu Val Arg Leu Leu Asp Gly

65 70 75

Ile Lys Gln Arg Pro Trp Lys Pro Gly Pro Gly Gly Gly Ser Gly

80 85 90

Thr Asn Cys Thr Asn Ala Leu Arg Ser Gln Ser Ser Thr Val Ala

95 100 105

Asn Cys Ser Ser Lys Asp Leu Gln Ser Ser Gln Gly Gly Gln Gln

110 115 120

Pro Arg Val Gln Ser Trp Ser Pro Pro Val Arg Gly Ile Pro Gln

125 130 135

Leu Pro Cys Ala Lys Ala Leu Tyr Asn Tyr Glu Gly Lys Glu Pro

140 145 150

Gly Asp Leu Lys Phe Ser Lys Gly Asp Ile Ile Ile Leu Arg Arg

155 160 165

Gln Val Asp Glu Asn Trp Tyr His Gly Glu Val Asn Gly Ile His

170 175 180

Gly Phe Phe Pro Thr Asn Phe Val Gln Ile Ile Lys Pro Leu Pro

185 190 195

Gln Pro Pro Ser Gln Cys Lys Ala Leu Tyr Asp Phe Glu Val Lys

200 205 210

Asp Lys Glu Ala Asp Lys Asp Cys Leu Pro Phe Ala Lys Asp Asp

215 220 225

Val Leu Thr Val Ile Arg Arg Val Asp Glu Asn Trp Ala Glu Gly

230 235 240

Met Leu Ala Asp Lys Ile Gly Ile Phe Pro Ile Ser Tyr Val Glu

245 250 255

Phe Asn Ser Ala Ala Lys Gln Leu Ile Glu Trp Asp Lys Pro Pro

260 265 270

Val Pro Gly Val Asp Ala Gly Glu Cys Ser Ser Ala Ala Ala Gln

275 280 285

Ser Ser Thr Ala Pro Lys His Ser Asp Thr Lys Lys Asn Thr Lys

290 295 300

Lys Arg His Ser Phe Thr Ser Leu Thr Met Ala Asn Lys Ser Ser

305 310 315

Gln Ala Ser Gln Asn Arg His Ser Met Glu Ile Ser Pro Pro Val

320 325 330

Leu Ile Ser Ser Ser Asn Pro Thr Ala Ala Ala Arg Ile Ser Glu

335 340 345

Leu Ser Gly Leu Ser Cys Ser Ala Pro Ser Gln Val His Ile Ser

350 355 360

Thr Thr Gly Leu Ile Val Thr Pro Pro Pro Ser Ser Pro Val Thr

365 370 375

Thr Gly Pro Ser Phe Thr Phe Pro Ser Asp Val Pro Tyr Gln Ala

380 385 390

Ala Leu Gly Thr Leu Asn Pro Pro Leu Pro Pro Pro Pro Leu Leu

395 400 405

Ala Ala Thr Val Leu Ala Ser Thr Pro Pro Gly Ala Thr Ala Ala

410 415 420

Ala Ala Ala Ala Gly Met Gly Pro Arg Pro Met Ala Gly Ser Thr

425 430 435

Asp Gln Ile Ala His Leu Arg Pro Gln Thr Arg Pro Ser Val Tyr

440 445 450

Val Ala Ile Tyr Pro Tyr Thr Pro Arg Lys Glu Asp Glu Leu Glu

455 460 465

Leu Arg Lys Gly Glu Met Phe Leu Val Phe Glu Arg Cys Gln Asp

470 475 480

Gly Trp Phe Lys Gly Thr Ser Met His Thr Ser Lys Ile Gly Val

485 490 495

Phe Pro Gly Asn Tyr Val Ala Pro Val Thr Arg Ala Val Thr Asn

500 505 510

Ala Ser Gln Ala Lys Val Pro Met Ser Thr Ala Gly Gln Thr Ser

515 520 525

Arg Gly Val Thr Met Val Ser Pro Ser Thr Ala Gly Gly Pro Ala

530 535 540

Gln Lys Leu Gln Gly Asn Gly Val Ala Gly Ser Pro Ser Val Val

545 550 555

Pro Ala Ala Val Val Ser Ala Ala His Ile Gln Thr Ser Pro Gln

560 565 570

Ala Lys Val Leu Leu His Met Thr Gly Gln Met Thr Val Asn Gln

575 580 585

Ala Arg Asn Ala Val Arg Thr Val Ala Ala His Asn Gln Glu Arg

590 595 600

Pro Thr Ala Ala Val Thr Pro Ile Gln Val Gln Asn Ala Ala Gly

605 610 615

Leu Ser Pro Ala Ser Val Gly Leu Ser His His Ser Leu Ala Ser

620 625 630

Pro Gln Pro Ala Pro Leu Met Pro Gly Ser Ala Thr His Thr Ala

635 640 645

Ala Ile Ser Ile Ser Arg Ala Ser Ala Pro Leu Ala Cys Ala Ala

650 655 660

Ala Ala Pro Leu Thr Ser Pro Ser Ile Thr Ser Ala Ser Leu Glu

665 670 675

Ala Glu Pro Ser Gly Arg Ile Val Thr Val Leu Pro Gly Leu Pro

680 685 690

Thr Ser Pro Asp Ser Ala Ser Ser Ala Cys Gly Asn Ser Ser Ala

695 700 705

Thr Lys Pro Asp Lys Asp Ser Lys Lys Glu Lys Lys Gly Leu Leu

710 715 720

Lys Leu Leu Ser Gly Ala Ser Thr Lys Arg Lys Pro Arg Val Ser

725 730 735

Pro Pro Ala Ser Pro Thr Leu Glu Val Glu Leu Gly Ser Ala Glu

740 745 750

Leu Pro Leu Gln Gly Ala Val Gly Pro Glu Leu Pro Pro Gly Gly

755 760 765

Gly His Gly Arg Ala Gly Ser Cys Pro Val Asp Gly Asp Gly Pro

770 775 780
Val Thr Thr Ala Val Ala Gly Ala Ala Leu Ala Gln Asp Ala Phe

785 790 795

His Arg Lys Ala Ser Ser Leu Asp Ser Ala Val Pro Ile Ala Pro

800 805 810

Pro Pro Arg Gln Ala Cys Ser Ser Leu Gly Pro Val Leu Asn Glu

815 820 825

Ser Arg Pro Val Val Cys Glu Arg His Arg Val Val Val Ser Tyr

830 835 840

Pro Pro Gln Ser Glu Ala Glu Leu Glu Leu Lys Glu Gly Asp Ile

845 850 855

Val Phe Val His Lys Lys Arg Glu Asp Gly Trp Phe Lys Gly Thr

860 865 870

Leu Gln Arg Asn Gly Lys Thr Gly Leu Phe Pro Gly Ser Phe Val

875 880 885

Glu Asn Ile



<210> 18 
<211> 283
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7602049CD1 

<400> 18 

Met Ser Tyr Ser Val Thr Leu Thr Gly Pro Gly Pro Trp Gly Phe
1 5 10 15

Arg Leu Gln Gly Gly Lys Asp Phe Asn Met Pro Leu Thr Ile Ser
20 25 30

Arg Ile Thr Pro Gly Ser Lys Ala Ala Gln Ser Gln Leu Ser Gln
35 40 45

Gly Asp Leu Val Val Ala Ile Asp Gly Val Asn Thr Asp Thr Met
50 55 60

Thr His Leu Glu Ala Gln Asn Lys Ile Lys Ser Ala Ser Tyr Asn
65 70 75

Leu Ser Leu Thr Leu Gln Lys Ser Lys Arg Pro Ile Pro Ile Ser
80 85 90

Thr Thr Ala Pro Pro Val Gln Thr Pro Leu Pro Val Ile Pro His
95 100 105

Gln Lys Val Val Val Asn Ser Pro Ala Asn Ala Asp Tyr Gln Glu
110 115 120

Arg Phe Asn Pro Ser Ala Leu Lys Asp Ser Ala Leu Ser Thr His
125 130 135

Lys Pro Ile Glu Val Lys Gly Leu Gly Gly Lys Ala Thr Ile Ile
140 145 150

His Ala Gln Tyr Asn Thr Pro Ile Ser Met Tyr Ser Gln Asp Ala
155 160 165

Ile Met Asp Ala Ile Ala Gly Gln Ala Gln Ala Gln Gly Ser Asp
170 175 180

Phe Ser Gly Ser Leu Pro Ile Lys Asp Leu Ala Val Asp Ser Ala
185 190 195

Ser Pro Val Tyr Gln Ala Val Ile Lys Ser Gln Asn Lys Pro Glu
200 205 210

Asp Glu Ala Asp Glu Trp Ala Arg Arg Ser Ser Asn Leu Gln Ser
215 220 225

Arg Ser Phe Arg Ile Leu Ala Gln Met Thr Gly Thr Glu Phe Met

230 235 240

Gln Asp Pro Asp Glu Glu Ala Leu Arg Arg Ser Arg Glu Arg Phe

245 250 255

Glu Thr Glu Arg Asn Ser Pro Arg Phe Ala Lys Leu Arg Asn Trp

260 265 270

His His Gly Leu Ser Ala Gln Ile Leu Asn Val Lys Ser

275 280 

<210> 19 
<211> 1830 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 5566074CB1 

<400> 19 

atgtacacct tcgtggtacg cgatgagaac agcagcgtct acgccgaggt ctcccggctg 60 
ctcctcgcca ccggccactg gaagaggctg cggcgagaca accccagatt caacctgatg 120 
ctgggagaga ggaatcggct gcccttcggg agactgggtc acgagcccgg gctggtacag 180 
ttggtgaatt actacagggg tgctgacaaa ctgtgtcgca aagcttcttt agtgaagcta 240 
atcaagacaa gccctgaact ggctgagtcc tgcacatggt tccctgaatc ttatgtgatt 300 
tatccaacca atctcaagac tccagttgct ccagcacaga atggaattca gccaccaatc 360 
agtaactcaa ggacagatga aagagaattc tttctcgcct cttataacag aaagaaagag 420 
gatggagagg gcaacgtttg gattgcaaag tcatcagccg gtgccaaagg tgaaggcatt 480 
ctcatctcct cagaggcttc agagcttctc gatttcatag acaaccaggg ccaagtgcac 540 
gtgatccaga aatatcttga gcaccctctg ctgcttgagc caggtcatcg caagtttgac 600 
attcgaagct gggtcttggt ggatcatcag tataatatct acctctatag agagggtgtg 660 
cttcggactg cttcagaacc atatcatgtt gataatttcc aagacaaaac ctgccatttg 720 
accaatcact gcattcaaaa agagtattca aagaactacg ggaagtatga agaaggaaat 780 
gaaatgttct tcaaggagtt caatcagtac ctaacaagtg ctttgaacat taccctagaa 840 
agtagtatct tactacaaat caaacatata ataaggaact gcctcctgag cgtggagcct 900 
gccattagca ccaagcacct cccttaccag agcttccagc tcttcggctt tgacttcatg 960 
gtcgatgagg agctgaaggt gtggctcatt gaggtcaacg gtgcccctgc atgtgctcag 1020 
aagctctatg cagaactgtg ccaaggcatc gtggacatag ccatttccag tgtcttccca 1080 
cccccagatg tggagcaacc tcagacccag ccagctgcct tcatcaagct gtgacagagg 1140 
gcactccctg ctgccttgga aaaagcacgg ggtcctgctc cagggaatgg tgaaatgact 1200 
ggattgctct ttatccagcc cacagcaggg gaaagaaagg caactcgcaa agatgagatg 1260 
gaagaaggca cgtgagcaga ggaggcagct cccaaagaga gggctgctca gggggcttcc 1320 
caggtgtagc tctcagcagt gctgttgaga cttttgaaaa caactttggt acacaaaggc 1380 
agctttgtga gcagagctcc ttcccctctc cccgggaacg gcagggcact gggacctctg 1440 
gtcggtgcct cccacccact gcagccctag tgccttagct ccatgcccgg ctgcagcccc 1500 
actgctctgg actatggatt ggacgtcaga gcatattgga ggttgcctgt gtgttcccca 1560 
cccatccctt cggtaacact ctgccacact aagctctgta caagcatgca ccaacagtcc 1620 
ttagttttgt gctgtgcact ggcctctcgg caaaggtggt ttccctcatc accttcctga 1680 
tggtgtttgg tcagtcacct gtcagggttt gtgcgggttg ggccccaaaa cagcatatgc 1740 
tgctctaagt ctgctctctg catgttttag aaacaaagtg gcaagtctgc cctgaacctg 1800 
taagcatcaa ataagcatga gagagaaaaa 1830 

<210> 20 

<211> 2795 

<212> DNA 

<213> Homo sapiens 
<220>

<221> misc_feature
<223> Incyte ID No: 5679814CB1

<400> 20 

ggaaaaactc tctgctcgtc atcaaggcag catcatcatc gttattgatt ctatagatca 60 

agftcagcaa gttgaaaaac acatgaaatg gctgatagat ccactgccag tgaatgtaag 120 

agtaattgtt tctgtgaatg tagaaacatg ccctccagca tggaggttgt ggcctacact 180 

tcatcttgat cccttaagtc caaaagatgc aaaatctatt ataattgcag aatgccactc 240 

tgtagacatt aaattgagta aagagcagga gaagaagcta gaacgacact gtcgttctgc 300 

tacaacctgc aatgcccttt atgtcaccct tttcggcaaa atgatcgcgc gtgctgggag 360 

agcaggcaat ttagataaaa tccttcatca gtgtttccag tgtcaagata ctctttcatt 420 

atatagactt gttctgcact ctatccggga gtccatggca aatgatgtgg ataaagagct 480 

aatgaagcag atcctctgcc ttgtcaatgt tagtcacaat ggtgtgagtg aatcagaact 540 

gatggaactc tatcctgaga tgtcctggac tttcttgacc tcccttattc acagtttata 600 

caaaatgtgt ttgttgactt atggatgtgg cttgcttagg tttcaacatc tgcaggcttg 660 

ggaaacagtg agattggagt acctggaagg ccccactgtt acttcttcat acaggcaaaa 720 

gctaatcaac tatttcacct tgcagctaag tcaggacaga gtgacttgga gaagtgcaga 780 

tgaactcccg tggctttttc agcagcaggg aagtaaacag aagctgcatg attgccttct 840 

taatctcttt gtgtctcaaa acctttataa aaggggacac tttgctgagt tgctgagtta 900 

ttggcagttt gttggcaaag acaaaagtgc aatggcaaca gaatacttcg attcattgaa 960 
gcagtatgag aaaaactgcg aaggcgagga caacatgagt tgcttagctg atctttatga 1020 

aaccttgggg cgatttctca aggatctagg ccttctcagt caggccatag tacctttgca 1080 

gaggtcttta gagattcgag aaacagcttt agatcccgat cacccaagag tagcccagtc 1140

cctccaccaa ctagcaagtg tatacgtgca gtggaagaag tttggcaatg cagaacaact 1200 

gtataaacag gcgttggaaa tctcagaaaa tgcttatggt gcggaccatc catatactgc 1260 
tcgtgaactt gaagcacttg caactttgta ccagaaacaa aataaatatg aacaagctga 1320 

acattttagg aaaaaatcct ttaaaattca tcagaaagct ataaagaaaa aaggcaactt 1380 

gtacggattt gcccttttac gtagacgggc tttacagtta gaagagctta cattaggtaa 1440 
ggacacacct gataatgctc ggaccctcaa tgaactgggt gttctctact atcttcaaaa 1500 

taacctggaa acagctgacc agtttctgaa gcgttcctta gaaatgaggg agcgagttct 1560 

aggaccagat caccctgact gtgctcagtc tttgaataat ctggcagctc tatgcaatga 1620 

aaagaaacag tatgataaag cagaagaact ttatgaaaga gctttagata ttcggagacg 1680 

tgcattagct cctgatcacc cttctttggc atatacggtg aagcatcttg ccatcttgta 1740 

taagaaaatg gggaaacttg acaaagctgt acctttgtat gaattggctg ttgaaattcg 1800 

acagaaatct tttggcccaa agcaccctag tgtagctact gccttggtga acttagctgt 1860 

tctttatagc caaatgaaaa aacacgttga agctttgcca ttatatgaaa gagcattaaa 1920 

gatttatgaa gatagcctgg gtcggatgca tcctcgagtt ggagaaacac tgaaaaattt 1980

agctgtgctt agctatgaag gaggagattt tgaaaaagct gctgaattat acaaaagggc 2040 

aatggaaata aaagaagcag aaacatcact cttgggtgga aaagctcctt cacgccattc 2100

atcaagtgga gacacgttta gcttaaaaac agctcattct cctaatgttt tccttcagca 2160 

aggacaaagg taatagcagc agttagaatt ctttgcaaat gtaccttaag acaaaataat 2220 

taaacatttg gaacatttga atttgaaact ttaaaaaaat gttgtacgaa attttactac 2280 
gtgtgattta actgctattt gtatgaagtt gtattggatt acattaagtt ggaattgtga 2340 
ttatgtctgt tttagttgtt taaaagaatt ttcctattat atggtatcca aggatgtaga 2400 

cacattagaa ttataagaag acatgaggag caaatcatga agagcggatt ggtctttgtt 2460 
caacaagagc tggcagagta gttaagacaa ggagttcaaa aattccatga atcttggcca 2520 

ggcatggtgg ctcatgcctg taatcccacc actttgggag ggtgatgcag gaggatcact 2580 

tgaggccagg agttggagac cagcctggcc aacatggtga aaccctgtct ctactaaaaa 2640 

tacaaaaaaa aattagctgg acctggtggc gcatgcctgt aatcctagct actcaggagg 2700 
ctgaggcatg agaatcgctt gaactcagca agtggaggtt gctatgagct gagattgtgc 2760 

cagttcattc cacactgggc aacagggtga gctga 2795 

<210> 21 
<211> 4436 
<212> DNA
<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 7472735CB1 

<400> 21 

gccagatcgc cgcgcgaggg atggtgggca tcgaggtccc agcagcggac gagggaggtg 60
ccgccgtcgc ccaggatggg ctgggaatga agcgatgtag ccttttaaga gatttgctct 120
gacccatctg aagtccatat ggctctgtat gatgaagacc tcctgaaaaa tcctttctat 180
ctggctctgc aaaagtgccg ccctgacttg tgcagcaaag tggcccaaat ccatggcatt 240
gtcttagtac cctgcaaagg aagcctgtcg agcagcatcc agtctacttg tcagtttgag 300
tcctacattt tgatacctgt ggaagagcat tttcagacct taaatggaaa ggatgtcttt 360
attcaaggga acaggattaa attaggagct ggttttgcct gtcttctctc agtgcccatt 420
ctctttgaag aaactttcta caatgaaaaa gaagagagtt tcagcatcct gtgtatagcc 480
catcctttgg aaaagagaga gagttcagaa gagcctttgg caccctcaga tcccttttcc 540
ctgaaaacca ttgaagatgt gagagagttc ttgggaagac actccgagcg atttgacagg 600
aacatcgcct ctttccatcg aacattccga gaatgcgaga gaaagagcct ccgtcaccac 660
atagactcag cgaatgctct ctacaccaaa tgcctccagc agcttctgag ggactctcac 720
ctgaaaatgc tcgccaagca ggaggcccag atgaacctga tgaagcaggc agtggagata 780
tacgtccatc atgaaattta caacctgatc tttaaatacg tggggaccat ggaggcaagt 840
gaggatgcgg cctttaacaa aatcacaaga agccttcaag atcttcagca gaaagatatt 900
ggtgtgaaac cggagttcag ctttaacata cctcgtgcca aaagagagct ggctcagctg 960
aacaaatgca cctccccaca gcagaagctt gtctgcttgc gaaaagtggt gcagctcatt 1020
acacagtctc caagccagag agtgaacctg gagaccatgt gtgctgatga tctgctatca 1080
gtcctgttat acttgcttgt gaaaacggag atccctaatt ggatggcaaa tttgagttac 1140
atcaaaaact tcaggtttag cagcttggca aaggatgaac tgggatactg cctgacctca 1200
ttcgaagctg ccattgaata tattcggcaa ggaagcctct ctgctaaacc ccctgagtct 1260
gagggatttg gagacaggct gttccttaag cagagaatga gcttactctc tcagatgact 1320
tcgtctccca ccgactgcct gtttaagcac attgcatcag gtaaccagaa agaagtggag 1380
agacttctga gccaagagga ccatgataaa gataccgtcc aaaagatgtg tcaccctctc 1440
tgcttctgcg atgactgtga gaaactcgtc tctgggaggt tgaatgatcc ctcagttgtc 1500
actccattct ccagagacga cagggggcac acccctctcc atgtggctgc tgtctgtggg 1560
caggcatccc tcatcgacct cctggtttcc aagggcgcca tggtaaatgc cacagactac 1620
catggagcca ctccgctcca cctggcctgt cagaagggct accagagcgt gacgctgctg 1680
ctgctgcact acaaggccag cgcggaagtg caggacaaca atgggaatac gccactccac 1740
ctggcctgca cttacggcca cgaggactgt gtgaaggctc tggtttacta cgacgtggag 1800
tcgtgcagac ttgacattgg caatgagaaa ggagacaccc ctctacacat tgctgcccgc 1860
tggggctacc aaggcgtcat agagacattg ctgcagaacg gagcgtccac cgagatccag 1920
aacagactga aggagacgcc cctcaagtgt gcattaaact caaagattct gtctgtaatg 1980
gaagcctatc acctgtcctt cgagaggagg cagaagtcgt ccgaggcccc tgtgcagtcc 2040
ccgcagcgct ccgtggactc catcagccaa gagtcctcca cttccagctt ctcctccatg 2100
tcagccagct caaggcagga ggagaccaag aaggactaca gagaggtaga aaaacttttg 2160
agagcagttg ctgatggaga tctagaaatg gtgcgttacc tgttggaatg gacagaggag 2220
gacctggagg atgcggagga cactgtcagt gcagcggacc ccgaattctg tcacccgttg 2280
tgccagtgcc ccaagtgtgc cccagctcag aagaggctgg cgaaggttcc tgccagtggg 2340
cttggtgtga acgtgaccag ccaggacggc tcctccccgc tgcatgtcgc cgccctgcac 2400
ggccgggcgg acctcatccc cctcctgctg aagcacgggg ccaacgcagg tgccaggaac 2460
gcagaccaag ccgtcccgct ccacctggcc tgccagcagg gccactttca ggtggtgaag 2520
tgtctgttag attcgaatgc aaaacccaat aagaaggacc tcagtggaaa cacgcccctc 2580
atttacgcct gctccggtgg ccatcacgag cttgtggcac tgctgctaca gcacggggcc 2640
tccattaacg cttctaacaa taagggcaac acagcgctgc acgaggctgt gattgaaaag 2700
cacgtcttcg tggtagagct gcttctgctc cacggagcgt cagttcaggt gctgaacaag 2760
cggcagcgca cggctgtaga ctgtgctgaa cagaattcaa aaataatgga attgcttcag 2820
gtggtaccaa gctgtgttgc ttcattagat gatgtggctg aaactgaccg caaggagtat 2880
gtcactgtta agatcaggaa aaaatggaac tcaaaactgt atgatctacc agatgagcct 2940
tttacaagac agttttactt tgtccactca gctggtcagt ttaagggaaa gacttcaagg 3000
gagattatgg caagagatag aagtgtccct aatttaaccg aaggttcttt gcatgagcca 3060 
sggaggcaaa gtgtcacact gagacagaat aacctgccag ctcagagtgg atctcatgct 3120 
gctgagaaag gcaacagcga ctggccagag aggcctggac tgacacagac tggccctgga 3180 
cacagacgga tgctgcggag acacacggta gaggatgcgg tcgtgtccca gggcccggag 3240 
gctgctggcc ccctctccac tccccaagag gttagtgctt cccggtccta acaggaatga 3300 
ggagttgttg aacccactgc taggaagcaa ggatgcaaca agatgatgct gagcgtgaac 3360 
acatctgaga actaaatgtg cttccatgag actggcttga gaagtcttca gcaccaagtt 3420 
cctgaaagct tttctgtggc aggaaagaat gcaacaaaaa agttaaccac caccatctct 3480 
ctcctcttca aagctaatga atacaattga aacagacaaa aattccagta gcatccagat 3540 
ccttaagcca gaggtgcatg cttcttttta agtatgaggg tttgttggtc acagtgggag 3600 
aggtttcacc accgcattct gacctcctcc tcccaaaagg tgctaaacct ctctgacctg 3660 
tgtacattca caaaccacag ctagaattcc tccacctagg attaagctgg agagaagtaa 3720 
gtaatttagg tttcatggta ctgtagaggc caggctgaaa tgtcatatct gaaggaagaa 3780 
agcagcagct ggacaatgtt tctttgcaaa gcaacactcg aaccaaaaga tgcctcaatc 3840 
ccattttgat attcatttta gtgaaaggat gcatcagacc tgttccacat catgcacatg 3900 
ggaaagggtg gttatcattt tccttctaac aagtaggtac agatattcgg ttactacacg 3960 
tgcacctgta gcagtatttc tagaaacatc cctttttgtt gagaacctcc cttgaatgtc 4020 
tgtcacactc acacctgacg ggatggttac tggattagag agtagatttg gcacatcttt 4080 
tcttagtctt ttgattcaaa ttcaaaactt aacagcacaa accaggtcag agttactttc 4140 
ggttagaatt tattgccatt tattcctttt tataaatttc tatagattat actgttattt 4200 
ttatgttatt ggcctagagc tacacgtata tgggtttgtc ctgagtccgt tttcaaatga 4260 
ccttgtgata gggaaatggt tttgtccatg ttcttggaaa cacttgtgta tgtacagaag 4320 
gaagggaggg attatttttc tacaaagtaa tttatgattt ctaattttct aatgtgcctt 4380 
ggatatgtgc caaatgatgg aaaagaaaca gtaaacttta tgattcttaa aaaaaa 4436 

<210> 22 

<211> 2040 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7131221CB1 

<400> 22

cacagagtga acaagagaga gtcatttggg aaacaaaagg agaattttac agagagagag 60 
ggatagctaa aactacgtga gcctggcgag ggtgcagagc agaaagtaga gactgtccga 120 
agactgctat ctgggacgag acaagttgtt aaagggacag gagagaaagc agagctattt 180 
caagagtgag ccacagaagg gaatccagag gccatctaag cgaggaaggg tctacaggca 240 
gtgagtgaag gccaggagca gggcccaggc caggcacgac caccgagggg atgaacttca 300 
cagtgggttt caagccgctg ctaggggatg cacacagcat ggacaacctg gagaagcagc 360 
tcatctgccc catctgcctg gagatgttct ccaaaccagt ggtgatcctg ccctgccaac 420 
acaacctgtg ccgcaaatgt gccaacgacg tcttccaggc ctcgaatcct ctatggcagt 480 
cccggggctc caccactgtg tcttcaggag gccgtttccg ctgcccatcg tgcaggcatg 540 
aggttgtcct ggacagacac ggtgtctacg gcctgcagcg aaacctgcta gtggagaaca 600 
ttatcgacat ttacaagcag gagtcatcca ggccgctgca ctccaaggct gagcagcacc 660 
tcatgtgcga ggagcatgaa gaagagaaga tcaatattta ctgcctgagc tgtgaggtgc 720 
ccacctgctc tctctgcaag gtcttcggtg cccacaagga ctgtgaggtg gccccactgc 780 
ccaccattta caaacgccag aaggacaata gccggaggca gaagcagttg ttaaaccaga 840 
ggtttgagag cctgtgcgca gtgctggagg agcgcaaggg tgagctgctg caggcgctgg 900 
cccgggagca agaggagaag ctgcagcgcg tccgcggcct catccgtcag tatggcgacc 960 
acctggaggc ctcctctaag ctggtggagt ctgccatcca gtccatggaa gagccacaaa 1020 
tggcgctgta tctccagcag gccaaggagc tgatcaataa ggtcggggcc atgtcgaagg 1080 
tggagctggc agggcggccg gagccaggct atgagagcat ggagcaattc accgtaaggg 1140 
tggagcacgt ggccgaaatg ctgcggacca tcgacttcca gccaggcgct tccggggagg 1200 
aagaggaggt ggccccagac ggagaggagg gcagcgcggg gccggaggaa gagcggccgg 1260

atgggcctta aggcctgcgc cgacccgacc ctgctcgaga gcccgcgcta gagtcgggga 1320 

ggatctgcgc agagaccgca gcatcaccca aatcggcgcc ggccccggga ggatctcaat 1380 

aaagaactcg agcgtcccag acccgtatct cctttcgctg cccaaccccg cagcctgggc 1440 

ttcgaaggcg acccgcccac catcctgccc ttcccagaac ctgagaccgt ctggggggcg 1500 

gaagccaaat gaacccctat tgggcacctc tgtgatgtca ggagcgaact ggtgagccca 1560 

gcgccctggg aagagggccg agggcggggc ggtggtgccg ggacctctga ggtcctgggg 1620 

atttggggac ccttggggtc cacatgcacc tggctgacct ggctgaaagc cgctgtctcg 1680 

gagcccccca cagcattttg ttcccctccc gctggcccgg gggccccacc ttcccacggg 1740 

ttcccacgct gctgtgactg ccctgcctct acgacaaaag ccaacgggtc ttcagtactt 1800 

ttattaaaaa atagtcacgc agacagtgcc ctggtggctc tgccccgcat cccaactctg 1860 

cfcrgtggggga aaggggtcaa cgttttcgca gccccaaacc gggccatcac ttgcccaccg 1920 

agtcgaatat gatgcggttc tgctcggcgc gctcccgctg gctctgcgtc cgcgccagct 1980 

ccagcagggt ccgcagcagg tgaaaggtga ggtcaatgga cagagaaggg ttgtccgcgc 2040 



<210> 23 
<211> 2067 

<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7480551CB1 

<400> 23 

ggagctggga gggagcttta aggggcggac sggcgggagg tcggggtcct ccggggatta 60
gagccggtgg gctcgttgtg ggcgccattt ctcggcgtct cccgaggagc cgcccctttc 120
tcagccttgc tcggctcttc cccgctctgg tcgccggggc tgcgccgtcc ccagctcagt 180
gacaaaaatg ctgagtttct tccgtagaac actagggcgt cggtctatgc gtaaacatgc 240
agagaaggaa cgactccgag aagcacaacg cgccgccaca catattcctg cagctggaga 300
ttctaagtcc atcatcacgt gtcgggtgtc ccttctggat ggtactgatg ttagtgtgga 360
cttgccaaaa aaagccaaag gacaagagtt gtttgatcag attatgtacc acctggacct 420
gattgaaagc gactattttg gtctgagatt tatggattca gcacaagtag cacattggtt 480
ggatggtaca aaaagcatca aaaagcaagt aaaaattggt tcaccctatt gtctgcatct 540
tcgagttaag ttttattcct cagaaccaaa taaccttcgt gaggagctaa cccggtattt 600
atttgttctt cagttaaaac aagatattct cagtggaaaa ttagactgtc cctttgatac 660
agcagtgcaa ttggcagctt ataatctgca agctgaactt ggtgactatg atcttgctga 720
gcatagtcct gaacttgtct cagagttcag attcgtgcct attcagactg aagagatgga 780
actggctatt tttgagaaat ggaaggaata cagaggtcaa acaccagcac aggctgaaac 840
caattatctg aataaagcca aatggctaga aatgtatggg gttgatatgc atgtggtcaa 900
ggctagagat gggaatgact atagtttggg actaacacca acaggagtcc ttgtttttga 960
aggagatacc aaaattggct tatttttttg gccgaagata accagattgg attttaagaa 1020
gaataaatta accttggtgg ttgtagaaga tgatgatcag ggcaaagaac aggaacatac 1080
atttgtcttt agactggatc atccaaaagc atgcaaacat ttatggaaat gtgctgtgga 1140
gcatcatgct ttcttccgcc ttcgaggccc cgtccaaaag agttctcatc gatcaggatt 1200
tattcgacta ggatcacgat ttagatatag tgggaaaaca gagtatcaga ccacaaaaac 1260
caataaagca agaagatcaa catcctttga aagaaggccc agcaaacgat attctagacg 1320
aactctacaa atgaaagcat gtgctacaaa acctgaagaa cttagtgttc acaataatgt 1380
ttcgacccaa agtaatggct cccaacaggc ttgggggatg agatctgctc tgcctgtgag 1440
tccttccatt tcctctgctc ctgtgccagt ggagatagag aatcttccac agagtcctgg 1500
aacagaccag catgacagga aatggctctc tgctgccagc gactgctgtc aacgtggtgg 1560
aaaccagtgg aacacaaggg ccttgccccc accccagacc gcacatagaa actacactga 1620
ctttgttcat gagcacaatg tgaagaatgc aggaatccgt catgatgttc attttcctgg 1680
ccatacagcc atgactgaga tatgagtgtt gagcctctta ggctttggga ctctttgtca 1740
tgcaagttga tggtatacat tatctggtgt ttataaagga ttaatcacat taggagtatt 1800
tgggagaatt tacagtgagt cactagttgt tcagtgctgt ttgtaattga attcttccat 1860
gaaagggaca aggaatcaag gaagccatat agcatcaatg ataatgacaa atgtttgtgt 1920
tgaaaagagt gtgtatacca ttgtggtttt ggaagagttt tcagacctta gtatgttcac 1980 
acatcaccag actgtatctc aggagaaggt ttgtgtttgt gaacaaggtg cccattattc 2040 
ccccaccaca tgccatccaa agagatc 2067

<210> 24 

<211> 1640 
<212> DNA 

<213> Homo sapiens



<220> 

<221> misc_feature

<223> Incyte ID No: 3315870CB1 

<400> 24 

gctgcaaacc ccactagcca gtgtcagcct ctcggcggga ggaggcggcg gcggaggagg 60
agcaggggga gggctgtcaa attcgggagc cagatttttt cccttctcct ggcaatccct 120
tccgcttccc cggctcccgg cgtgacatct gcgggccggg gacctgcatg tgtgtgcgcg 180
cgaaggagcg gaagaatggc agtgctcaaa ctcaccgacc agccaccatt ggttcaggca 240
atcttcagcg gtgatccaga ggagatccgg atgctcatcc ataaaactga agatgtgaat 300
actctggatt ctgagaaacg aacccctctt catgtggccg catttctggg agatgcagag 360
atcattgaac tcctgatttt gtcaggagct cgtgtaaatg ccaaggacaa catgtggctg 420
actccactgc accgggctgt tgcttccaga agtgaagaag cagtacaggt tttgattaag 480
cactcagctg atgtcaatgc aagggacaag aactggcaga cccctcttca tgtggcagca 540
gccaacaagg ctgtcaaatg tgcagaagtg atcattcccc tgctgagcag tgtcaatgtc 600
tccgaccgag gggggcgcac agccttgcac catgcggctc tgaacggcca cgtggagatg 660
gtcaatttac tcttggccaa aggggcaaat atcaatgcat ttgacaagaa ggaccggcgt 720
gctctgcact gggcagcata catgggccac ttggatgttg tagcattgct cattaaccat 780
ggcgcagaag tgacctgtaa ggataagaag ggttataccc ctctgcatgc tgcagcctcc 840
aatggacaga ttaatgttgt caagcatctc ctgaacctgg gggtggagat tgatgaaatc 900
aatgtctatg gaaatacagc gcttcacatc gcctgctaca atggacagga tgctgtggtt 960
aacgagttga ttgactacgg tgctaacgtg aaccagccaa acaataatgg gttcacccct 1020
ttgcattttg ctgctgcctc cactcatggt gctttgtgtc ttgaattgtt agtaaacaac 1080
ggggcagatg ttaacattca gagtaaagat ggcaaaagtc cactgcacat gacagctgtc 1140
catggaaggt tcacacggtc acagaccctc attcagaatg gaggtgaaat tgactgtgtg 1200
gataaggacg gcaacactcc tctccatgtg gctgcaagat acggtcatga gcttttgatt 1260
aacaccttaa taaccagcgg agctgacaca gccaagtagg ttaccgcaaa aatacggtgg 1320
aataattgcc tcaagtggga atactgccaa aaagattctt ccgtgcagta gatagttccc 1380
atttatccag gttaaggtgg atctatacca ttatactaaa tacaaattaa attttaaata 1440
aattaagtgc cttttaatga cagaggcaaa gaagaaccaa ttttattttt tagcttcatc 1500
caaatgaggt ctatttcagt ggttttaatt aaggaaactt gaactttatt cgtaactttt 1560
ctttctaata ttcttctgtc cttcccaatg cttcatatta aacaggaaaa ataaaaccta 1620
actgagccaa tttagaatgt 1640

<210> 25
<211> 1497
<212> DNA

<213> Homo sapiens



<220> 

<221> misc_feature

<223> Incyte ID No: 7484690CB1



<400> 25 

atgagggaaa tcgtgctcac gcagaccggg cagtgcggga accagatcgg ggccaagcag 60 
ttctgggagg tgatctctga tgaacatgcc atcgactccg ctggcaccta ccacggggac 120 
agccacctgc cgctggagcg cgtcaacgtg caccaccacg aggccagcgg tggcaggtac 180
gtgcctcgcg ctgtgctcgt ggatctggag ccgggcacca tggactccgt gcgctcgggg 240 
cccttcgggc aggtcttcag gccagacaac ttcatttccc gtcagtgtgg ggccggaaac 300 
aactgggcca agggacgcta caccgaaggc gcggagctga cggagtcagt gatggacgtt 360 
gtcagaaagg aggctgagag ctgtgactgc ctgcagggtt tccagctgac ccactccctg 420 
ggtgggggga ctgggtctgg gatgggtacc cttctgctca gtaagatccg ggaggagtac 480
ccagacagga tcataaacac attcagcatc ctgccctcgc ccaaggtgtc ggacaccgtg 540 
gtggagccct acaacgtcac cctctcagtc caccagctca tagaaaacgc ggatgagacc 600 
ttctgcatag ataacgaagc gctatatgac atatgttcca ggaccctaaa actgcccaca 660 
cccacctatg gtgacctgaa ccacctggtg tctgctacca tgagtggggt caccacgtgc 720 
ctgcgcttcc cgggccagct gaatgctgac ctgcggaagc tggccgtgaa catggtcccg 780 
tttccccggc tgcatttctt catgcccggc tttgccccac tgaccagccg gggcagccag 840 
cagtaccggg ccttgactgt ggctgagctt acccagcaga tgtttgatgc taagaacatg 900 
atggctgccc gtgacccctg tcacggccgc tacctaacgg tggctgccat tttcaggggt 960 
cgpatgccca tgagggaggt ggatgaacag atgttcaaca ttcaagataa gaacagcagc 1020 
tactttgctg actggttccc cgacaacgta aaaacagccg tctgtgacat cccaccccgg 1080 
gggctaaaaa tgtcagccac cttcattggg aacaacacag ccgtccagga actcaagcgg 1140 
gtctcagagc agtttacagc aacgttcagg cgcaaggcct tcctccactg gtacacgggc 1200 
gagggcatgg atgagatgga attcactgag gccgagagca acatgaacga cttggtgtct 1260
gaatatcagc aatatcagga tgccacggcc gagggaggag gagtatgagg aggaggaggt 1320 
ggcctagaac tctccttttc taggtaaagg ggggaagcag tgtggatcct tcactgtgtt 1380 
ctgacagcca tgtgtcacta tgcgctcgtt catttgtgtc ttcacatctc ctgctgcatt 1440
ttaaagcatt tttatagtat gcggttttgc ctaataaagt attctcacag cgaaaaa 1497 

<210> 26 

<211> 2065 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7612559CB1 

<400> 26 

ccgagatccg cgctctctac aacgtgctgg ccaaagtgaa gcgggagcgg gacgagtaca 60 
agcggaggtg ggaagaggag tacacggtgc ggatccagct gcaagaccgt gtaaatgagc 120 
tccaggagga agcccaggag gctgatgcct gccaggagga gctggcactg aaggtggaac 180 
agttgaaggc tgagctggtg gtcttcaagg ggctcatgag taacaacctg tcggagctgg 240 
acaccaagat ccaggagaaa gccatgaagg tggatatgga catctgccgc cgcatcgaca 300 
tcaccgccaa gctctgcgat gtggctcagc agcgcaactg cgaggacatg atccagatgt 360
tccaggtccc atccatgggg gggcggaagc gggagcgcaa ggctgccgtc gaggaggaca 420 
cctccctgtc ggagagtgag gggccccgcc agcccgatgg ggatgaggag gagagcacag 480 
ccctcagcat caacgaggag atgcagcgca tgctcaacca gctgagggag tatgattttg 540 
aggacgactg tgacagcctg acttgggagg agactgagga gaccctgctg ctttgggagg 600 
atttctcagg ctatgccatg gcagctgcag aggcccaggg agagcagcag gaagatagcc 660 
tggagaaggt gattaaagat acggagtccc tgttcaaaac ccgggagaag gagtatcagg 720 
agaccattga ccagatagag ctggagttgg ccacggccaa gaacgacatg aaccggcacc 780 
tgcacgagta catggagatg tgcagcatga agcgcggcct ggacgtgcag atggagacct 840 
gccgccggct catcacccag tctggagacc gaaagtctcc tgctttcact gcggtcccgc 900 
ttagcgaccg ccgccgccgc caagcgaggc tgaggactcc gatcgcgatg tcticatctga 960 
cagctccatg agatagagac ctgcctcccc cttgcacccg aggccctcgc agcagggagc 1020 
tcagcgaggc agagggtggg gctgcacaga ggggaacatc agctgcagct ctgcaccagg 1080 
ccggtccctg gggactgggg cgctcctccc tcaggctttc tccctcagtc ttggcttctc 1140 
cagggctctg gggtgtctgg agctaggctt ggccctacca ttctggggcc atttccacca 1200 
cagttggggc tctcctgcct tcacgcgtgg gtgtctgcta cttccccatc tttaaaatgc 1260
tgccagagcg attgcggccc ctcaccttgt ccacgtatca ggaatgtgaa tgtgggacct 1320 
ttcctccatc cctgttgtcc ggagccagct cactgtcttc cacactggtg ctaactggcc 1380

caggcactgg agtggaatag aatgcagctg gaggctacgc atggcctctg cagcacacgc 1440 

agctggagag ggcttctgtc cctgtcagcg gcagagggcg ttggggctgg ccggggcacc 1500 

ttgtccctgc tatggtccac atgctcacgc tgtccacctg ccaggtggag tgtatgtggc 1560 

tgtggccctc cctcgtggag gtgccgtgct ttaaagaggc cttagtgccc gggatgggca 1620

cagtgttttg aagggaggtg ggagctcttg ctgtcctggt cactgcagaa tgacagagaa 1680

ggtgaagctc catgcatgtg tgcgcgggtg tatgtgcgct cagggtctct gtttaagtat 1740 

cagctaaaga tgtgcttcct ccgtgtctgt catacactga gaccaacagg ctacagtgtc 1800 

cctgattctt ggaaaagcct ggagaagctg gggagatgcg gttcacaatg cctcggtata 1860 

ggaggctgtg ttgagctgac attcaaatgg attctttaat aataatgaaa ctggcgagta 1920 

tttattgtgc actttggtgt ccctgtctcc agcacttcct aatattcact agtttgaact 1980 

ctgaggtagg tacttttttt ttttgagatg gagtctcata ctctgttgcc taggctggag 2040 

tgcagtggtg cgatcacagc tcgct 2065 

<210> 27 

<211> 762 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4940751CB1 

<400> 27 

gcagaggcag catagcagca gccagctcca tccatcctct ttcccctcct cgcttcgctt 60 
cctcggcgga ttcctcctcc ctcgacagtc cccgtcgccg tccccttccg gtgcgcaagt 120 
cgcccgagat ggcaaacgcg agatcgggtg tcgctgtgaa tgacgagtgc atgctcaagt 180 
tcggcgagct gcagtcgaag aggctgcacc gcttcctaac tttcaagatg gacgacaagt 240 
tcaaggagat cgttgtggac caggtcgggg atcgcgctac cagctacgag gacttcacaa 300 
acagcctccc cgagaatgac tgccgatacg cgatctatga tttcgacttt gtcactgcag 360 
aagatgtcca gaagagcagg atcttctata tcctatggtc cccatcctcc gccaaggtga 420 
agagcaagat gctttatgca agctcaaacc aaaaattcaa gagtgggctc aatggcattc 480 
aggtggaact gcaggctact gatgcaagtg aaatcagcct tgatgagatc aaggatcggg 540 
ctcgctaggc atcatcatga tcatgcatca tggacttggc ctactactgt ggatttgtat 600 
gccattatag acttggtgct gtgaaagact gcttgatgat ttgcgggttt gttgctgtgt 660 
aaaaaaaggt cccatggctc ccagaagacc atgaaggttc ggatctatca tgtaattcct 720 
tgttatctgc gaattaatgt atagtgttgc attggtcgcg tc 762 

<210> 28 
<211> 2211 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7946761CB1 

<400> 28 

atgacctggg gcaccccgga ctttcttaat cgtagctcca cccactcgag ccgggtgcct 60 
tcgcgtttcc cgtttttaaa tgagatagtg gcacacccgg tggcatcctc ccacccgggc 120 
tcttatcggc ggtcccagac cctgcttgag cgcctccggg tgtcaagggc ccctgaggac 180 
actaaagctc tcgaaccccg atgtggaccc ccgtgcggcg cggggcagcc tggctgggaa 240 
ccctgctcgg ccctggagag gggccccccg agccgagggg aggagcggcg catgcccaca 300 
agccccccgg cgggaagtag gaaatcgacc gaccaggcgg tgcgcttcgg acccagccag 360 
ggcatgtgct cggaggcccg cctggctcgc aggttgcggg atgcgctgcg ggaggaggag 420 
ccgtgggcag tagaggagct gctgcgctgc ggcgcggacc ctaatttggt gctagaggac 480 
ggcgcagcgg ctgtgcactt ggcggccgga gcccggcacc cgcgcggcct gcgttgcctc 540
ggggccctac tgcgccaagg cggggacccc aacgctcgat ctgtcgaggc actgacgccg 600 
ctgcatgtgg ccgccgcgtg gggctgccgc cgcggcctgg agctgctgct gagccaagga 660 
gcggacccgg cgctgcgcga ccaggacgga ctccggccgc tggacctggc cctgcagcag 720 
ggacacctgg agtgcgcgcg agtcctgcag gatctcgaca cgcggaccag gacccggacc 780 
cggatcgggg cagagactca ggagcccgag cctgcacctg gcaccccagg cctctctgga 840
cctaccgatg agacgctgga ctccatagca ctccaaaagc agccatgcag aggtgacaac 900 
agggacattg gcttggaggc tgacccagga ccccccagcc tccctgttcc ccttgaaact 960 
gtggacaaac atgggagctc ggcgtcccct ccagggcact gggattacag ctcagacgcc 1020 
tctttcgtca cagcggttga ggtctctgga gctgaggacc cagcctcgga cactcccccc 1080 
tgggctgggt cattgccacc gaccaggcag ggacttctgc atgttgtcca tgccaaccag 1140 
agggtaccta ggtctcaggg cacggaggca gaactgaatg cccgtctgca ggccctgact 1200 
ctgaccccac caaatgctgc tggcttccag tcctcccctt cctccatgcc tctcctggac 1260 
aggagtccag ctcatagccc cccacggaca ccaacccctg gagcttctga ctgccactgc 1320 
ctgtgggagc accagacatc cattgatagt gacatggcca cgctctggct gacagaggat 1380 
gaggcaagct ctacaggtgg cagggaacct gtcggccctt gccggcacct gccagtctcc 1440 
actgtgtctg acttggagtt gctgaaggga ctccgagcac ttggtgagaa tcctcacccc 1500 
atcacaccct tcaccaggca gttgtaccac cagcagctgg aagaagccca gattgctcca 1560 
ggcccagagt tttcagggca cagcctagaa ctggctgcag ccctgcggac gggctgtatt 1620 
ccagatgtcc aggcagatga agacgcgctg gcccagcagt ttgagcggcc agatcctgcc 1680 
aggaggtggc gggagggggt cgtgaagtct agcttcacgt atctgctgct ggaccccagg 1740 
gagactcagg acctgccagc ccgagccttc tcactgaccc cagctgagcg ccttcagact 1800 
ttcatccgtg ccatcttcta cgtgggcaaa gggacgaggg cccggccata tgtccacctc 1860 
tgggaggccc ttggtcacca tgggcggtca agaaaacagc cccaccaggc ctgccccaag 1920 
gtgcgtcaga tcttggacat ctgggccagt ggttgcggcg ttgtgtccct acattgcttc 1980 
cagcacgtgg tcgctgtgga ggcttataca cgggaggcgt gtattgtgga agccctaggg 2040 
atccagacgc tcaccaacca gaagcaaggg cactgctatg gagtggtggc aggttggcca 2100 
cctgctcgtc gccggcgctt gggggtgcac ctgctgcacc gtgccctcct tgtcttcctg 2160 
gctgaaggcg agcgacagct tcatccccag gacatccagg cccggggctg a 2211 

<210> 29 

<211> 1634 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3288747CB1 



<400> 29 

ctcagctaag ggtcagcatc ttatccccac tttctggcct ccccaccatg agccgccaat 60
tcacctacaa gtcgggagct gctgccaagg ggggcttcag cggctgctcc gctgtgctct 120
cagggggcag ctcatcctcc taccgagcag ggggcaaagg gctcagtgga ggcttcagca 180
gtcggagcct ttacagcctg gggggtgccc ggagcatctc tttcaatgtg gccagtggca 240
gtgggtgggc aggaggctat ggatttggcc ggggccgggc cagtggcttt gctggcagca 300
tgtttggcag tgtggccttg gggtccgtgt gtccgtcgtt gtgcccgccc gggggtatcc 360
atcaggtcac catcaacaag agcctcctgg cacccctgaa cgtggagctg gaccctgaaa 420
tccagaaagt gcgtgcccag gagcgggagc agatcaaggt gctgaacaac aagttcgcct 480
ccttcattga caaggtgcgg ttcctggagc agcagaacca ggtgctggag accaagtggg 540
agctgctaca gcagctggac ctgaacaact gcaagaataa cctggagccc atccttgagg 600
gctacatcag caacctgcgg aagcagctgg agacgctgtc tggggacagg gtgaggctgg 660
actcggagct gaggagcgtg cgcgaagtgg tggaggacta caagaagaga tacgaagaag 720
eiaataaacaa gcgcacaact gctgagaatg aatttgtggt gcttaagaag gacgtggacg 780
cagcttacac gagcaaagtg gagctgcagg ccaaggtgga tgccctggat ggagaaatca 840
agttcttcaa gtgtctgtac gagggggaga ctgctcagat ccagtcccac atcagcgaca 900
cgtccatcat cctgtccatg gacaacaacc ggaacctgga cctggacagc atcattgctg 960
aggtccgtgc ccagtatgag gagatcgccc ggaagagcaa ggccgaggcc gaggccctgt 1020

accagaccaa gttccaggag ctgcagctag cagccggccg gcatggggat gacctgaaac 1080 
acaccaaaaa tgagatctca gagctgaccc gtctcatcca aagactgcgc tcggagattg 1140 
agagtgtgaa gaagcagtgt gccaacctgg agacggccat cgctgacgcc gagcagcggg 1200 
gggactgtgc cctcaaggat gccagggcca agctggatga gctggagggc gccctgcagc 1260 
aggccaagga ggagctggca cggatgctgc gcgagtacca agagcttttg agcgtgaagc 1320 
tgtccctgga tattgagatc gccacctacc gcaagctgct ggagggcgag gagtgcagga 1380 
tgtccggaga atataccaac tccgtgagca tttcggtcat caacagctcc atggccggga 1440 
tggcaggcac aggggctggc tttggattca gcaatgctgg cacctacggc tactggccca 1500 
gctctgtcag cgggggctac agcatgctgc ctgggggctg tgtcactggc agtgggaact 1560
gtagcccccc agtggtcagc aatgtcacca gcacaagtgg cagctctggc agtagccgtg 1620 
gagtttttgg aggg 1634 

<210> 30 

<211> 4706 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 8200016CB1

<400> 30 

catgtgaaca ccaattagag ctgactattc ccgggattgt ggtactcggg gctgtgtcaa 60 
tcaagggtgc tacaatagca cgtgcaccag tggtgcctca agacccaccg gggagaggct 120 
tatcttaact ccagctgccg aatgagaatg agtttgaagc tttttgcagg atcatggaac 180 
agagcctcca tgcaatagtg catcctgagg taaactgtta cctgagtaag ggctttaagt 240 
aatgcatttc ctgggaacga cagttgtgac agaagagaat gctggaaccc gtagcaagat 300 
tcctgtctga gatggaaaga tgtctcacta tcattttatc aagtgctgtt gctttcagct 360 
atgtaacgtt tttcgatccc atgagatgga aatcgaccag tgcttgctag agtcccttcc 420
ccttggccaa cggcagcgtc tagtgaagcg catgcgctgt gagcaaatca aagcctacta 480 
tgagcgcgag aaggcttttc agaagcagga agggttcctg aaaaggctga agcatgcgaa 540 
gaatccgaaa gttcacttca acctcacgga catgctacag gacgcgatta tccaccacaa 600 
tgacaaagaa gtgcttcggc tcctgaagga gggggcagac ccccacaccc tcgtctcctc 660 
gggagggtcc ctgctccatc tgtgtgctcg gtatgataat gccttcattg cagaaattct 720 
gattgacaga ggagtcaacg tcaaccacca ggatgaagac ttctggacgc ccatgcacat 780 
tgcctgtgcc tgcgataacc ctgatattgt cctgcttctt gtattagctg gagccaatgt 840 
ccttctccag gatgtgaatg gaaatatccc attagattat gctgtagaag ggacagaatc 900 
cagctctatc ctgttgacct atctggatga aaatggagtg gatttgacct cactgcgcca 960 
gatgaagctt cagagaccaa tgagtacgtt aacagatgtc aaacacttct tatcatctgg 1020
aggaaatgtc aatgagaaaa acgatgaagg agtaaccctg ttacacatgg cgtgtgcgag 1080 
tggctacaag gaggtggtgt ctcttatcct ggaacatggt ggagacctca acatagtaga 1140 
tgatcagtac tggactcccc tccacttggc agccaaatat ggccagacaa atctggtgaa 1200 
acttctcctg atgcatcagg caaacccaca cctcgtgaac tgtaatgagg agaaggcgtc 1260 
agatattgct gcctctgagt ttattgagga aatgctgctg aaagccgaaa ttgcctggga 1320 
agaaaaaatg aaagagcctt tatctgcttc taccttagct caagaagagc cctatgaaga 1380 
gatcattcac gatcttcccg tactgtcgag taagctcagt cccctggtgt taccaattgc 1440 
caagcaagac agtttgttgg aaaaagacat tatgttcaaa gatgcaacaa aaggtctgtg 1500 
taagcagcag tctcaggaca gcatccctga aaaccccatg atgagcggtt ccaccaaacc 1560 
cgagcaggtc aagctaatgc ctcctgcccc aaacgatgac ctggcaacgc tcagcgagct 1620 
caatgatggc agcctgctct atgagattca gaagcgcttt gggaacaatc agatctatac 1680
attcattgga gacattcttt tgcttgttaa cccatacaag gagcttccaa tttattcttc 1740 
catggtgtcc cagctgtatt tcagctcctc agggaagctg tgttcctcgc tgcctcctca 1800 
cctcttctcc tgtgtggaga gagcctttca ccagctcttc cgggaacagc ggcctcagtg 1860 
tttcatcctc agtggagaaa ggggatcagg aaagtctgaa gccagcaaac aaatcataag 1920 
acacctcacc tgcagggctg gcgccagcag ggccacactg gattccagat tcaaacatgt 1980 
cgtgtgcatc ttagaagcct ttggacatgc caagaccaca cttaatgatt tgtccagttg 2040
cttcatcaag tattttgaac tgcagttctg tgagaggaaa caacagctaa ccggagccag 2100
aatttataca tatttgctag agaaatccag acttgtttca caacctcttg gccagagcaa 2160
ttttctcatt ttctacttgt tgatggatgg gttatctgct gaagaaaaat atggacttca 2220
tcttaataat ttatgtgcac accggtattt gaaccagacc atacaggatg atgcatccac 2280
aggggagcgt tctctgaaca gggagaaafct ggctgttttg aaacgagccc tgaatgtagt 2340
tggcttcagc agcttggagg tggagaatct gttcgtaatt ctagcagcaa tattgcacct 2400
tggagacatt cggtttactg ccctgaatga ggggaactcc gccttcgttt ctgacctcca 2460
gctcctggaa caagtggctg gaatgttaca agtatcaaca gatgaattgg catctgcctt 2520
aacaactgat attcaatatt ttaaagggga tatgataata cgacgacata ccatacagat 2580
tgctgagttt ttccgagacc tcttggccaa gtccctgtac agtcgtttgt ttagcttttt 2640
ggtgaatacc atgaattctt gcctccacag tcaagatgaa cagaaaagca tgcagacatt 2700
ggatattgga atattggaca tttttggttt tgaagagttt caaaagaatg aatttgaaca 2760
actttgtgtc aacatgacca atgagaagat gcaccactat atcaatgaag tgctttttct 2820
ccacgagcaa gtggaatgtg tacaagaggg agttaccatg gaaacagcat attctgctgg 2880
taaccagaat ggagttttgg actttttttt ccagaagcca tctggatttc tcaccttatt 2940
ggatgaagaa agtcaaatga tttggtcagt ggaatcaaat tttccaaaaa aactacaaag 3000
tctcctagaa tcctcaaaca caaatgcggt gtactccccc atgaaggatg ggaatgggaa 3060
tgttgccctc aaagaccacg gtacagcctt caccatcatg cactacgcag gaagggtaat 3120
gtatgatgtt gttggggcga ttgaaaaaaa taaagactcc ctttcacaga atcttctatt 3180
tgtaatgaaa actagtgaaa atgtcgtgat caatcatttg ttccagtcga aattgtcaca 3240
aacaggatcc ctcgtatctg cctatccttc ctttaaattc cgaggacata agtctgccct 3300
gctcagtaag aaaatgacag cttcttcaat tattggagaa aacaagaatt atctagaact 3360
tagtaagtta ttaaaaaaga aaggaacttc tacatttctt caaagattgg aacgaggaga 3420
tccagtcacc atagcatcac aactcaggaa atcactaatg gatattattg gaaaacttca 3480
gaagtgcact ccacacttca ttcattgcat caggcccaat aactcaaagc tgccagatac 3540
ttttgataat ttttacgtgt ctgctcagct acaatatatt ggggtcctgg agatggtgaa 3600
gatcttccga tatggatacc ctgttcgcct ttccttctcg gatttcctgt caaggtataa 3660
gccactggct gatacattcc tgcgtgagaa gaaggaacag tcagctgccg agcgatgtcg 3720
acttgttctc cagcagtgta aattacaagg ctggcagatg ggagtccgaa aagtgtttct 3780
aaaatactgg catgctgacc aactcaatga tttgtgccta cagttgcaga gaaaaattat 3840
aacctgccaa aaagttatca gaggattttt agcacgccag cacctgcttc agagaatgag 3900
catcagacaa caagaggtga cttctatcaa tagctttctg cagaacacag aggacatggg 3960
gctgaaaacc tacgatgccc tggtcattca gaatgcttca gacattgccc gggaaaatga 4020
ccggctccgt agtgaaatga acgctcccta ccataaagag aagttagagg tcaggaacat 4080
gcaagaggaa ggaagcaaaa gaaccgatga caagagtgga cccaggcatt tccaccccag 4140
ctccatgtca gtctgcgcgg ccgtggatgg cctgggccag tgcctcgttg gcccgtccat 4200
ctggtctcct tcgctgcact cggtgttcag catggatgac agcagcagcc tcccgtctcc 4260
acggaaacag cccccgccca agccaaagag ggaccccaac acccggctga gtgcttccta 4320
tgaggctgtg agcgcctgcc tctccgcggc cagggaagcg gccaacgaag gtcagccttg 4380
gggagggacc cagcctcgtg ttccgggctc gcgcatgctc tgacttcgcc ttggggcgcc 4440
catggcagta ctgtcgccct aatgtattct taatagaaat aaatccaatt gttggcttgc 4500
cagcagctct taatcattaa atataaatat atttattcaa tctctaagcc tcttagggaa 4560
aagctactta catggcattt ccttaatccc atccccaacc tgctccaaga gcagtatcaa 4620
tcattcagaa agtcggagtt attcagttaa ggtcccatgg gaagttccca aaaaaaaacg 4680
gtcggctcct tccaccttta aattac 4706

<210> 31

<211> 3029

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 3291962CB1

<400> 31

ctggctggag gttgacacag gagtgctcag gggagcagca tcacaagagg gcagatcgaa 60
agcatcgtcc ttgctgaaaa aatggcagag gtggaagcgg tacagctgaa ggaggaagga 120
aaccggcatt tccagctcca ggactacaag gccgccacaa atagctacag ccaggccctg 180
aagctgacca aggacaaggc cctgctggcc acgctttatc ggaaccgggc agcctgtggc 240
ctgaaaacgg agagctacgt ccaggcagct tcagatgcct ccagagccat cgacatcaac 300
tcctcggaca tcaaggctct gtatcggcga tgccaggcac tggagcacct ggggaagctg 360
gaccaggcct tcaaagacgt gcagcgttgt gccaccctcg agccacggaa ccagaacttc 420
caggagatgc tgaggagact caacaccagc attcaggaga aactccgagt gcagttctcc 480
acagactcga gggtacagaa gatgtttgag atcctcttgg atgaaaacag tgaggctgat 540
aagcgggaaa aggctgccaa caatctcatt gtcctaggcc gtgaggaagc aggggctgag 600
aagatcttcc agaacaatgg agtagccttg ctactgcagc ttctggacac taagaagcct 660
gagctggtgc tggctgcagt gcggaccctg tcgggcatgt gcagcggcca ccaagccaga 720
gccacagtga ttctgcatgc agtgcggata gaccgaatct gtagcctcat ggccgtggag 780
aatgaggaga tgtctctggc tgtctgcaac ctgctccaag ccatcattga ctccttgtct 840
ggggaggaca agcgggagca tcgagggaag gaggaggccc tggttctaga caccaagaag 900
gacctgaagc agatcaccag ccacctgctg gacatgctag tcagcaagaa ggtgtctggc 960
cagggcaggg atcaggcgct gaacctgctc aataagaatg ttcccaggaa ggaccttgcc 1020
attcatgaca actcacgtac catctatgtg gtggataatg gtctgaggaa gatcctgaag 1080
gttgtggggc aggttccaga tctgccatcc tgcctgcccc tgactgacaa cacccgcatg 1140
ctggcctcta tcctcatcaa caagctctat gatgacctgc gctgtgaccc ggagcgcgat 1200
cacttccgca agatctgtga ggaatatatc acgggcaagt ttgaccccca ggacatggac 1260
aagaacttga atgccatcca gacagtgtca gggatcctgc agggcccctt tgacctgggc 1320
aaccagctgc tgggactgaa aggtgtgatg gagatgatgg tggcactatg tggctcagag 1380
cgcgagacgg accagctggt ggccgtggag gccctcatcc atgcctccac gaagctcagc 1440
cgcgccacct tcatcatcac caatggagtg tcactgctca aacagatcta caagaccacc 1500
aaaaatgaga agatcaagat ccgcacactg gtgggactct gtaagctcgg ctctgcaggt 1560
ggcacagact acggtctcag gcagtttgcg gaagggtcga cagaaaaact ggccaaacag 1620
tgtcgcaagt ggctgtgcaa tatgtccata gacactcgga cccgacgctg ggcagtggag 1680
ggcctggcct acctcacgct ggacgctgat gtgaaggacg actttgtcca ggacgtccct 1740
gccctgcagg ccatgtttga gctggccaag accagtgaca agaccatcct gtactcggtg 1800
gccaccaccc tggtgaactg caccaacagc tacgatgtca aggaggtcat cccagagctt 1860
gtccagctcg ccaagttctc caagcagcat gtgcccgagg aacaccccaa ggacaagaag 1920
gactttatag acatgcgggt gaagcggctt ctgaaggcgg gtgtcatctc tgccctggct 1980
tgcatggtga aagcagatag tgccatcctc actgaccaga ccaaggagct gctggccagg 2040
gtattcctgg cactgtgtga caacccaaag gaccgaggca ccattgtggc tcaaggtggt 2100
ggcaaggccc tgattcccct ggctttggag ggcacagatg tgggcaaggt gaaggcagcc 2160
cacgctctag caaagatcgc tgctgtctcc aatccggaca ttgcttttcc tggggagcgg 2220
gtgtatgagg tggtgcggcc ccttgtaaga ctcttggaca cacagaggga tgggcttcag 2280
aactatgagg ctctcctagg cctcaccaac ctgtctgggc ggagtgacaa actccggcag 2340
aagatcttta aggagagggc cttgccagac atcgagaact acatgtttga gaatcatgat 2400
cagctgcggc aggcggccac cgagtgcatg tgcaacatgg tgctccacaa ggaggtacag 2460
gaaaggttct tggctgacgg gaatgaccgg ctgaagctgg tggtgctgct ctgcggggag 2520
gatgatgata aggtgcagaa tgcggctgca ggggctctgg ccatgctgac agcagcacac 2580
aagaaactgt gcctcaagat gactcaagtg acaacccagt ggttggagat cctccagcgg 2640
ctttgcctgc acgaccagct gtctgtccaa caccggggcc tggtcattgc ctacaaccta 2700
ctggcagccg atgctgagct ggccaagaag ctggtggaga gtgagctgct ggagatcctg 2760
actgtggtgg gcaaacagga gccagatgag aagaaggcag aagtggttca gacagcccga 2820
gaatgtctca tcaagtgcat ggattatggt ttcattaaac cagtgtctta gacagcgacc 2880
ctccgggatg ctgggagtgg tcctgtactg tgcagagtcc tgggttggtt gggttctcct 2940
gaagagtcag gtcatctagg gatcatagca gtgacaatga agtctcaata taaaggaaag 3000
acttgattgt tctctgaaaa aaaaaaaaa 3029

<210> 32
<211> 2074
<212> DNA
<213> Homo sapiens
<220> 

<221> misc_feature

<223> Incyte ID No: 1234259CB1 

<400> 32 

ctctcgcgag gacggacgcc attatcgcat ctccccgaca aacaccacga gaattccgca 60 
gcccacacgg tgaccaaaag ccagccccac tgtgagttga actctttcgt gttgaccggc 120 
cactctcctg tgctctggat gatgtcggaa cacgacctgg ccgatgtggt tcagattgca 180 
gtggaagacc tgagccctga ccacccagtt gttttggaga atcatgtagt gacagatgaa 240
gacgaacctg ctttgaaacg ccagcgacta gaaatcaatt gccaggatcc atctataaag 300 
tcattcctgt attccatcaa ccagacaatc tgcttgcggt tggatagcat tgaagccaaa 360 
ttgcaagccc tggaggctac ttgtaaatcc ttagaagaaa agctggatct ggtcacgaac 420
aagcagcaca gccccatcca ggttcccatg gtggccggct cccctctcgg ggcaacccag 480 
acgtgcaaca aagtgcgatg cgtcgtcccc cagactacag taatactcaa caatgatcgg 540 
cagaacgcca ttgtagccaa gatggaagac cccttgagca acagggcacc ggattccctg 600 
gaaaatgtca ttagcaacgc tgtgcctggg cgtcggcaga acaccattgt ggtgaaggtg 660 
ccgggccaag aagacagcca ccacgaggac ggggagagcg gctcggaggc cagcgactct 720
gtgtccagct gtgggcaggc gggcagtcag agcatcggga gcaacgtcac gctcatcacc 780 
ctgaacfccgg aagaggacta ccccaatggc acctggctgg gcgacgagaa caaccccgag 840 
atgcgggtac gctgcgccat catcccctcc gacatgctgc acatcagcac caactgccgc 900 
acggccgaga agatggcgct aacgctgctg gactacctct tccaccgcga ggtgcaggct 960 
gtgtccaacc tctcggggca gggcaagcac gggaagaagc agctggaccc gctcaccatc 1020 
tacggcatcc ggtgtcacct tttctataaa tttggcatca cagaatccga ctggtaccga 1080 
atcaagcaga gcatcgactc caagtgccgc acggcgtggc ggcgcaagca gcggggccag 1140 
agcctggcgg tcaagagctt ctcgcggaga acgcccaact cgtcctccta ctgcccttca 1200 
gagccgatga tgagcacccc acctcctgcc agcgagctcc cgcagccaca gccgcagccg 1260 
caggccctgc actacgcgct ggccaacgca cagcaggtgc agatccacca gatcggagaa 1320 
gacggacagg tgcaagtaat cccacaggga cacctccaca tcgcccaggt gccgcagggg 1380 
gagcaagtcc agatcacgca ggacagcgag ggcaacctcc agatccatca cgtggggcag 1440 
gacggtcagc ttctagaggc cacccgcatc ccctgcctcc tggccccatc cgtcttcaaa 1500
gccagcagtg gccaggtgct gcagggtgca cagctgatcg ccgtggcctc ctcggacccc 1560 
gcggcagcgg gcgtggatgg gtcgccactc cagggcagcg acatccaggt tcagtacgtg 1620 
cagctggcgc cagtgagtga ccacacggcc ggggcacaga cggccgaagc cctgcagccc 1680 
acgctacagc cggagatgca gctcgagcac ggggccatcc agattcagtg agcggtgccc 1740 
atggcaccag gagcccctcg ccggctccgc ctacggcccg gcccccacgc gccctgctct 1800 
cacggcctcg gcacaggcag cggctgcacg tgttctgctg aagtgcgtct gaaggccgct 1860 
gcctccgcgg ggaacagcat cctatcaact gaaagagcag ccgccgccgc ccccagccgg 1920 
agaccccttt cgtttgagtc ctgctgttgg tgtcggagca cgaggggagg cacggtgcgg 1980 
agagcgtcgc atatgcgcgg gaaatcaaga actatgatat ttttctgttt aaacagcttt 2040 
ttttaatttg ctatggtgtt tataacaaaa aaaa 2074 

<210> 33 
<211> 2710 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1440608CB1 



<400> 33 

atggccaagt ttgccctgaa tcagaacctg cccgacctgg gcggcccccg cctgtgcccg 60
gtccccgccg ccgggggcgc acgcagcccg agctcgccct actcggtgga gacgccctac 120
ggcttccacc tggacctgga cttcctcaag tacatagagg agctggagcg tggccccgct 180
gcccgccgcg ccccgggacc cccgacctcg cgccgtcccc gcgcgccccg gcccggcctc 240
gcgggcgcac gtagcccagg cgcctggaca tccagcgagt ccctggccag tgacgacggt 300 
ggagcaccgg gcatactctc ccagggcgcg ccctcggggc tcctgatgca gccgctgtcg 360 
ccgcgcgcgc ccgtgcgcaa cccgcgcgtc gagcacacgc tccgggagac cagccggcgg 420 
ctggagctgg cgcagacaca cgagcgcgcg cccagccccg gccgcggggt cccgcgcagc 480 
ccacgcgggt ccggccgcag cagccccgcc cctaaccttg cccctgcttc gcccggccct 540 
gcccaactgc agctggtgcg cgagcagatg gccgcggcgc tgcggcgcct gcgcgagctc 600 
gaggaccagg cgcgaacgct gcccgagctg caggagcagg tgcgcgcgct gcgcgccgag 660 
aaggcgcggc tgctggccgg gcgcgcgcag cccgagccgg acggggaggc tgagacgcgc 720 
ccggacaagc tcgcccagct gcggcggctc accgagcgcc tggccacctc cgagcgcggc 780 
ggccgtgcca gggccagccc ccgggctgac agcccagacg gcctggctgc agggcgcagc 840 
gagggcgcgc tccaggtcct cgacggggag gtcgggagtc tcgatgggac gccccagacc 900 
cgggaggtgg ccgccgaggc cgtgcccgag acccgagaag cgggtgccca ggccgtgccg 960 
gagacccggg aggccggcgt ggaggctgcc cccgagaccg tggaggcgga cgcgtgggtg 1020 
accgaggcgc tgctggggct gcctgcggcc gccgagcgcg agctagagct gctgcgcgcc 1080 
agtctggagc accagcgcgg ggtgagtgag cttctgcggg gccggttgcg ggagctggag 1140 
gaagcccgcg aggctgcgga ggaggcagcg gcgggggccc gggcccagct acgcgaggcc 1200 
accacccaga ccccgtggag ctgtgccgaa aaggccgcgc agaccgagtc cccggcagag 1260 
gcgccctcct tgactcagga gagctcgccc ggatccatgg acggagacag ggccgtggcg 1320 
cccgcgggca tcctcaaatc catcatgaag aagagagacg gcacacctgg tgcccaaccc 1380 
agctccggac ccaagagcct gcagtttgtt ggggtcctca acggagagta cgagagctcc 1440 
tccagcgagg acgccagcga cagcgatggc gacagcgaga acggtggcgc cgagcccccg 1500 
ggtagctcct cgggctccgg ggatgacagc ggcgggggat ccgactcggg cacccctggc 1560 
cctcccagcg gcggggacat ccgggaccct gagcccgagg cggaggcaga gcctcagcag 1620 
gtggcacagg ggaggtgcga gctgagcccg cgtctgaggg aggcgtgcgt agcgctgcag 1680 
cggcagctga gccggccccg cggagtagcc agcgacggcg gcgcagtgcg cctcgtggcc 1740 
caggagtggt ttcgagtgtc cagccagcgg cgctctcagg cggagcccgt ggccaggatg 1800 
ctggaagggg tgaggcgcct gggacccgaa ctgctggcgc acgtggtgaa cctggcggat 1860 
ggcaacggga acacggccct gcactacagt gtgtcccacg ggaacctggc catcgcaagc 1920
ctgctcctgg atacgggggc ctgcgaggtc aaccgccaga accgagccgg ctactcggcc 1980 
ctcatgctgg ctgcactcac ctctgtgagg caggaagagg aggacatggc tgtggtccag 2040 
agactcttct gcatgggtga tgtcaatgcc aaggccagtc agacggggca gacagccctc 2100 
atgctggcca tcagccatgg ccgacaggac atggtggcaa ccctactggc gtgtggggct 2160 
gatgtgaatg cgcaggatgc ggatggggcc acagcgctga tgtgtgccag tgagtatggg 2220 
cgcctggaca ccgtgcggct gctgctcacc cagccaggct gtgaccctgc catcctggac 2280 
aatgagggca ccagtgccct ggccatcgcc ctggaggctg agcaggatga ggtggccgct 2340 
ctgctacatg cccacctgag ctcgggccag cccgacaccc agagcgagtc accccctggc 2400 
tcccagacag ccacacctgg tgaaggagaa tgcggtgaca atggagagaa cccccaggtt 2460 
cagtaagctg cctcgtctgg ctcactacac ctagctgtgg ggagatctcc tcgtcagtca 2520 
cctcagcctt tggcgcacag aagggtccag ggtcccctgc tcagaggcta acactggccg 2580 
aagagaaagg caatttcagt tggggtgact gtggcaggaa ggggctcact ctggccccac 2640 
caaggtgagg tggggaccaa gtgatagagc cctgatccac ccactctctg aaacttcttt 2700 
gctaataaaa 2710 

<210> 34 

<211> 3527 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3413610CB1

<400> 34 

atggccagga gaggtaagaa gcccgtggtg agaacgctgg aggatctgac gctggactcg 60 
ggttatggtg gcgcggcgga ctcggtgcgc tcctccaact tgtctttgtg ctgttccgac 120 
tcgcacccgg cgtccccgta tggcgggagc
agccggcaca acagctttga cactgtcaac 
gactgcgccg gccagcactg ctcgcggctg 
ctccaggagc tggaggcgct gctgctgcgt 
cccggcggcc tgcccaagga cgcgctggcc 
gtgcgcatag ccaaagaggc gcagcgcctg
gagatccaga gcgccatgga gatcgtgctg 
gctgcgctgg ccgcactgtc cctctacaac 
cgcggcaagt cggcccgctg cggcctcacc 
gtggacagcc gcgtggcgct gcgcatccac 
atggagagcc tcttccggga catctactcg 
tgcagtggcc ctgggtcagg ctcgggctcc 
gcccccgcgg cggataaaga gcgggaggcg 
tgcagcgcag ccagcagcgc cagcgggggc 
gccgccgcag tcccgccgac agccgccgcc 
cacgaggcgc ccaagttcac cgtggagacc 
atctggggtc tcctgcagcc ctaccagcac 
ctggtgtccc gtgcaatgca tcacctgcag 
ctgccgccgc tgatggagtg gatccgggtg 
ttctccatgg acagcgacga cgtccgccag 
tgcgagccgc gccagctcag ggccgacgac 
gtggccatcg aagccaagtt taagcaggac 
acagacctgg tgaagcaggc agtgtctctg 
gaacagggca tgactcccct gatgtatgcc 
atgctgctgg atgccggagc tgacctgaat 
ccatccgtcc accccgagac ccgccattgg 
catattcctg tagttcagct cctcctggat 
catggcgagg agaactactc ggaaacaccc 
gagctggtta gtttgctgtt ggagcgtggt 
aatggaattt ctacaacccc ccagggtgat 
ggacacagga atgtgttccg caaactgctc 
ctgtccctgg aggagattct ggccgagggg 
ttgtgcgcca gccgcaacag caaggccaaa 
agcgctgagc atggctacgt ggatgtcaca 
actctgcaca cgtggctgga gtctttgcgg 
atccagtgct tgttaaagga gtttaagacc 
gttacccaag gcctgcccct gatgtttgag 
agccagcagc tgtgcgtcat cttcacacac 
acagaaatca aacggaaaca gacctcgcgc 
atgtctgatg ttacatttct ggtagaagga 
tttacagcct ctccaaggtt caaagcactc 
tgcatagaga ttggttatgt gaaatactcc 
tatggtggcc * cagagtcact gctcattaaa 
gctaagtttt tccagctgga ggctttgcag 
atcaataccg acaactgtgt ggatatttac 
ctctcagcat attgcgaagg ctactttctc 
gcattcaagc agctcctgta tgacaaaaat 
gacttacaga ggacgttggc catcagaatt 
tccgtggtat gaaacgccta gtgcagggaa 
cgcattggct ttacacaaac acagacaaat 
aggagctgcc tctactgctc ccacgtgttc 
ctgcagatca gatcagctgg gtccagagtt 
caattgaaag cacccctagg accattgaac 
tggattgagg ccttttaaag gtcactcagg 
taccaaccct gaagagattg tattacacat 
cgtcaaccat ggtagcaaat tggtgaggct 



tgctggccgc ctctagctga ctccatgcac 180 
actgccctgg tggaagactc cgaggggctg 240 
ctgccggacc tagacgaggt cccctggact 300 
tcgcgggatc cccgggcagg cccggcggtc 360 
aagctgtcga cgctggtgag ccgggcgctg 420 
agcctgcgct tcgccaagtg-caccaagtac 480 
tcctggggcc tggccgcgca ctgtacggcg 540 
atgagcagcg ccggcggcga ccgcctgggc 600 
ttctccgtgg gccgcgtgta tcgctggatg 660 
gagcacgccg ccatctacct gacagcctgc 720 
cgggtcgtgg cctccggggt gccccggagc 780 
ggcccaggcc cgagctcggg ccctggtgcg 840 
cccgggggag gagcggcgag cggcggcgcc 900 
agcagctgtt gcgccccgcc ggccgccgcg 960 
aaccaccacc atcaccacca ccatgcgctc 1020 
ctggagcaca cggtcaacaa cgactcggag 1080 
ctgatctgcg ggaagaacgc cagcggtgac 1140 
cccctccagg tggaaaggcc cttcctcgtg 1200 
gccgtggcgc acgccggcca ccgccgcagc 12 60 
gcggcccggc tgctgctgcc cggcgtggac 1320 
tgcttttgtg catctcgaaa gctggatgcg 1380 
ctgggtttcc ggatgctgaa ctgtggacga 1440 
ctggggcccg atgggatcaa caccatgagc 1500 
tgcgtccgtg gggacgaggc gatggttcag .1560 
gtggaggttg tcagtactcc tcataaatat 1620 
acggctctga cttttgctgt gttgcatgga 1680 
gctggggcca aggtggaagg ctcagtggag 1740 
ctccagctgg cagctgctgt aggaaatttt 1800 
gccgatcccc tgataggaac catgtacagg 1860 
atgaactctt tcagccaggc tgcagcccac 1920 
gcccagccag agaaggagaa gagtgatatc 1980 
actgacctgg cggagacagc cccgcccccc 2040 
ctgagggccc tgagggaggc catgtatcac 2100 
attgatatca ggagcatagg cgtcccgtgg 2160 
atcgccttcc agcagcaccg caggcctctc 2220 
attcaggagg aggaatacac ggaggagctc 2280 
atcctgaaag cgagcaagaa tgaagtgatc 2340 
tgctacgggc cctaccccat ccccaagctc 2400 
ttggatcctc attttcttaa caataaagaa 2460 
agaccatttt atgctcacaa agtgctgtta 2520 
ctctccagca agccgacaaa tgatggcacc 2580 
atctttcagc tggttatgca gtatctctac 2640 
aacaatgaga tcatggagct tctgtctgct 2700 
cgacactgtg agattatctg tgcgaaaagc 2760 
aaccatgcca agtttcttgg agtcacagag 2820 
aaaaacatga tggtcctcat tgaaaacgaa 2880 
ggtgaaggga ccggccagga tgtgctccag 2940 
cagtccatcc acttgtcgtc ttccaaaggt 3000 
tgcttcccgg gaactttcca gttctcctgc 3060 
tccacctggc acctgttttt ggctgggcca 3120 
ctgttgaaaa acaaaggact ttccactggt 3180 
taatgggcaa ctggacaacc aagttaaccc 3240 
acccactgcc ggggaccact gtccagtgaa 3300 
ttccaggttg acagttggag gacttcaccg 3360 
taaggacctt ggtagctgtg cttcagcaaa 3420 
gtgaccaata atgaggaaat aatctggcaa 3480 



46/48 



wo 02/053719 



FCTAJS02/00178 



atttttaggg gtgggaactt ttttaaatgt 

<210> 35 

<211> 3251 

<212> DNA 

<213->- Homo sapiens 

<220> 

<221> inisc_f eature 

<223> Incyte ID No: 3276394CB1 

<400> 35 

cggcgtcaga gacactgcga gcggcgagcg 
agccgctgcg gggccgcgaa caaagaggag 
ggatgttaca tgagtcattt taagggatgc 
tctcagtaaa gtagataaag atggatgaat 
tgtgtctaga gcgccttgat gcttctgcga 
agcgatgttt gctggggatc gtaggttctc 
ctcttgttgg ctcgggtgtc gaggagcttc 
atggcatcaa acagaggcct tggaaacctg 
caaatgcatt aaggtctcag agcagcactg 
gctcccaggg cggacagcag cctcgggtgc 
ctcagttacc atgtgccaaa gcattataca 
aattcagcaa aggtgacatc atcattttgc 
gggaagtcaa tggaatccat ggctttttcc 
tacctcagcc cccatctcag tgcaaagcac 
cagacaaaga ttgccttcca tttgcaaagg 
atgaaaactg ggctgaagga atgctggcag 
ttgagtttaa ctcggctgct aagcagctga 
ttgatgctgg agaatgttcc tcggcagcag 
acaccaagaa gaacaccaaa aagcggcact 
cctcccaggc atcccagaac cgccactcca 
ccagcaaccc cactgctgct gcacggatca 
cttctcaggt tcatataagt accaccgggt 
tgacaactgg cccctcgttt actttcccat 
ctttgaatcc tcctcttcca ccaccccctc 
caccaggcgc caccgccgct gctgctgctg 
ccactgacca gattgcaciat ttacggccgc 
atccatacac tcctcggaaa gaggatgaac 
tgtttgagcg ctgccaggat ggctggttca 
gggttttccc tggcaattat gtggcaccag 
ctaaagtccc tatgtctaca gctggccaga 
ccacggcagg agggcctgcc cagaagctcc 
ttgtccccgc agctgtggta tcagcagctc 
tgttgcacat gacggggcaa atgacagtca 
cagcgcacaa ccaggaacgc cccacggcag 
ccggcctcag ccctgcatct gtgggcctgt 
cgcctctgat gccaggctca gccacgcaca 
cccctctggc ctgtgcagca gctgctccac 
tggaggctga gcccagtggc cggatagtga 
acagtgcttc atcagcttgt gggaacagtt 
aagaaaaaaa gggtttgttg aagttgcttt 
tgtctcctcc agcatcgccc accctagaag 
agggagcggt ggggcccgaa ctgccaccag 
ctgtggacgg ggacggaccg gtcacgactg 
cttttcatag gaaggcaagt tccctggact 



tcatttaaaa aaaaaaa 3527 



cggtggggcc gcatctgcat cagccgccgc 60 
gagccgaggc gcgagagcaa agtctgaaat 120 
acacaactat gaacatttct gaagattttt IBO 
cagccttgtt ggatcttttg gagtgtccgg 240 
aggtcttgcc ttgccagcat acgttttgca 300 
gaaatgaact cagatgtccc gagtgcagga 360 
ccagtaacat cttgctggtc agacttctgg 420 
gtcctggtgg gggaagtggg accaactgca 480 
tggctaattg tagctcaaaa gatctgcaga 540 
aatcctggag ccccccagtg aggggtatac 600 
actatgaagg aaaagagcct ggagacctta 660 
gaagacaagt ggatgaaaat tggtaccatg 720 
ccaccaactt tgtgcagatt attaaaccgt 780 
tttatgactt tgaagtgaaa gacaaggaag 840 
atgatgttct gactgtgatc cgaagagtgg 900 
acaaaatagg aatatttcca atttcatatg 960 
tagaatggga taagcctcct gtgccaggag 1020 
cccagagcag cactgcccca aagcactccg 1080 
ccttcacttc cctcactatg gccaacaagt 1140 
tggagatcag cccccctgtc ctcatcagct 1200 
gcgagctgtc tgggctctcc tgcagtgccc 1260 
taattgtgac cccgccccca agcagcccag 1320 
cagatgttcc ctaccaagct gcccttggaa 1380 
tcctggctgc cactgtcctt gcctccacac 1440 
ctggaatggg accgaggccc atggcaggat 1500 
agactcgccc cagtgtgtat gttgctatat 1560 
tagagctgag aaaaggggag atgtttttag 1620 
aagggacatc catgcatacc agcaagatag 1680 
tcacaagggc ggtgacaaat gcttcccaag 1740 
caagtcgggg agtgaccatg gtcagtcctt 1800 
agggaaatgg cgtggctggg agtcccagtg 1860 
acatccagac aagtcctcag gctaaggtct 1920 
accaggcccg caatgctgtg aggacagttg 1980 
cagtgacacc catccaggta cagaatgccg 2040 
cccatcactc gctggcctcc ccacaacctg 2100 
ctgctgccat cagtatcagt cgagccagtg 2160 
tgacttcccc aagcatcacc agtgcttctc 2220 
ccgttctccc tggactcccc acatctcctg 2280 
cagcaaccaa accagacaag. gatagcaaaa 2340 
ctggcgcctc cactaaacgg aagccccgcg 2400 
tggagctggg cagtgcagag cttcctctcc 2460 
gaggtggcca tggcagggca ggctcctgcc 2520 
cagtggcagg agcagccctg gcccaggatg 2580 
ccgcagttcc catcgctcca cctcctcgcc 2640 
aggcctgttc ctccctgggt cctgtcttga atgagtctag acctgtcgtt tgtgaaaggc 2700 
acagggtggt ggtttcctat cctcctcaga gtgaggcaga acttgaactt aaagaaggag 2760 
atattgtgtt tgttcataaa aaacgagagg atggctggtt caaaggcaca ttacaacgta 2820 
£Lt:gggaaaac tggccttttc ccaggaagct ttgtggaaaa catatgagga gactgacact 2880 
gaagaagctt aaaatcactt cacacaacaa agtagcacaa agcagtttaa cagaaagagc 2940 
acatttgtgg acttccagat ggtcaggaga tgagcaaagg attggtatgt gactctgatg 3000 
ccccagcaca gttaccccag cgagcagagt gaagaagatg tttgtgtggg ttttgttagt 3060 
ctggattcgg atgtataagg tgtgccttgt actgtctgat ttactacaca gagaaacttt 3120 
tttttttttt aaagattttt gactaaagtg gccgattgtt ttccgggtta actaatttat 3180 
tgggttttta acttgaactt tcggtaaaaa aaaaagctgg ggaaatggtt tggaaatttt 3240 
attttgaaag g 3251 

<210> 36 
<211> 1600 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7602049CB1 

<400> 36 

gctccacgca gcccggctgg gcagcaaggg acagaacaga ggcggccgct gacagcacca 60 
gcatgtctta cagtgtgacc ctgactgggc ccgggccctg gggcttccgt ctgcaggggg 120 
gcaaggactt caacatgccc ctcactatct cccggatcac accaggcagc aaggcagccc 180 
agtcccagct cagccagggt gacctcgtgg tggccattga cggcgtcaac acagacacca 240 
tgacccacct ggaagcccag aacaagatca agtctgccag ctacaacttg agcctcaccc 300 
tgcagaaatc aaagcgtccc attcccatct ccacgacagc acctccagtc cagacccctc 360 
tgccggtgat ccctcaccag aaggtggtag tcaactctcc agccaacgcc gactaccagg 420 
aacgcttcaa ccccagtgcc ctgaaggact cggccctgtc cacccacaag cccatcgagg 480 
tgaaggggct gggcggcaag gccaccatca tccatgcgca gtacaacacg cccatcagca 540 
tgtattccca ggatgccatc atggatgcca tcgctgggca ggcccaagcc caaggcagtg 600 
acttcagtgg gagcctccct attaagg^cc ttgccgtaga cagcgcctct cccgtctacc 660 
aggctgtgat taagagccag aacaagccag aagatgaggc tgacgagtgg gcacgccgtt 720 
cctccaacct gcagtctcgc tccttccgca tcctggccca gatgacgggg acagaattca 780 
tgcaagaccc tgatgaagaa gctctgcgaa ggtcaaggga aaggtttgaa acggaacgta 840 
acagcccacg ttttgccaaa ttgcgcaact ggcaccatgg cctttcagcc caaatcctta 900 
atgttaaaag ctaaaaggct gcctggaatc cccccacccc aacaggctgg actccctcca 960 
tccttacccc cacacagatc tggcatgtga gccccacggt gatgcttgac aatgtataac 1020 
tctgctgggg gcacctctga tggccaaccg cagcatttct gtcctctgcc caccccagag 1080 
ctgatgctgg ggcccagccc cctgcagctc tgtacccacc aaacctcccc agggcaaccc 1140 
tcgccacccc ccaaatagcc cgtagcccaa tcccctgccc tctgcacagg gccttagctg 1200 
tagaccagag agggcaggag gggtttgctg gcataacacc ccagaaccaa gggaaatgga 1260 
tgggccgctg ctcagtttcc caccatcctc agctcctggc ctcatcccct cctagaatga 1320 
gtcacccgta gatcagggtc tggggaagag gctgatccct ggcgctgccc ggctccctcg 1380 
ctgccctctg gagctcaggg cagcccggaa tagggctctt tgaagaggaa gtagaagccc 1440 
cagggtaatg aggcagagac ccctcctggc agtggtgagg tgggggcatg caccctcctt 1500 
tctgtaccgt gtgtgctggc tccatagttc tctcttctgt acatateiagc atgcttgttc 1560 
tgaaataaag aagatttgaa gtgaaccaca aaaaaaaaaa 1600 